hackseq: Catalyzing collaboration between biological and computational scientists via hackathon

hackseq Organizing Committee 2016

doi:10.12688/f1000research.10964.2

Opinion Article

Revised

[version 2; peer review: 2 approved]

PUBLISHED 10 Apr 2017

This article is included in the Hackathons collection.

This article is included in the Bioinformatics Education and Training Collection.

Abstract

hackseq (http://www.hackseq.com) was a genomics hackathon with the aim of bringing together a diverse set of biological and computational scientists to work on collaborative bioinformatics projects. In October 2016, 66 participants from nine nations came together for three days for hackseq and collaborated on nine projects ranging from data visualization to algorithm development. The response from participants was overwhelmingly positive with 100% (n = 54) of survey respondents saying they would like to participate in future hackathons. We detail key steps for others interested in organizing a successful hackathon and report excerpts from each project.

Keywords

Hackathon, Genomics, Bioinformatics, Open Science, Diversity in Science

Corresponding author:

Competing interests: No competing interests were disclosed.

Grant information: The author(s) declared that no grants were involved in supporting this work.

Copyright: © 2017 hackseq Organizing Committee 2016. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Data associated with the article are available under the terms of the Creative Commons Zero "No rights reserved" data waiver (CC0 1.0 Public domain dedication).

How to cite: hackseq Organizing Committee 2016. hackseq: Catalyzing collaboration between biological and computational scientists via hackathon [version 2; peer review: 2 approved]. F1000Research 2017, 6:197 (https://doi.org/10.12688/f1000research.10964.2) First published: 28 Feb 2017, 6:197 (https://doi.org/10.12688/f1000research.10964.1) Latest published: 10 Apr 2017, 6:197 (https://doi.org/10.12688/f1000research.10964.2)

Revised Amendments from Version 1

We added a brief description of the roles of project leaders and participants during the hackathon to the research project summaries. We also expanded the discussion of female participation, as suggested by the referees, and corrected some typos.

Introduction

Technological advances in the biological sciences have led to an increasing availability of so-called ‘-omic’ datasets, allowing fundamental questions in biology to be answered at an unprecedented rate¹. However, these datasets are complex, requiring novel and specialized informatics tools for proper analysis and to overcome the computational bottleneck in research. Open-source development of bioinformatics tools and pipelines accelerates research by allowing the community to both reuse and thoroughly assess such methods. Thus, by solving biological problems in an open and collaborative manner, the field can progress at a faster rate than if code remains unavailable to the larger community².

Hackathons offer a solution to catalyze tool and pipeline development for biological data science, as well as foster interdisciplinary collaborations³. These events aim to solve defined computational problems over a period of days by bringing together small teams of individuals with different and diverse skillsets. Although frequently valuable for the outputs they generate, hackathons have faced criticism due to low levels of diversity amongst participants⁴. We therefore established hackseq, a genomics hackathon collective (http://www.hackseq.com) that aims to promote open science, collaboration and diversity. We placed special emphasis on promoting leadership amongst women, minorities and early-career scientists. The inaugural hackseq event took place over three days in October 2016 in Vancouver (British Columbia, Canada) and was a satellite event to the annual American Society of Human Genetics (ASHG) meeting. Here we report a summary of this hackathon in the hopes of promoting similar events in the future.

hackseq format

hackseq was the first genomics hackathon in Vancouver and was based on the NCBI hackathon format³. Some hackathons can be perceived as high-pressure events exclusive to technically inclined and experienced individuals. We thus took measures to ensure that people of all skill levels and backgrounds were encouraged to apply. We structured hackseq as a three-day event that ran primarily from 8 AM to 5 PM on the Saturday, Sunday and Monday prior to the 2016 ASHG meeting. The hackseq itinerary is accessible on the hackseq GitHub repository (https://github.com/hackseq/October_2016/blob/master/hackseq_2016_schedule.md).

First, we opened a call for ‘team leaders’ to propose a project and lead a small team at hackseq. We advertised the call through social media, such as Twitter, through announcements at the Vancouver Bioinformatics User Group (VanBUG) and bioinformatics.ca, and through direct email contact with potential leaders. We screened the projects to confirm that their aims and scope would be appropriate for a 72-hour hackathon. For the ten accepted projects we used GitHub as a discussion board, creating an issue thread (https://github.com/hackseq/hackseq_projects_2016/issues) for each project so that prospective participants could view and discuss the details of each project before applying to join a particular team.

Once we had established the ten hackseq projects, we opened the call for participants. Our main goal in recruiting participants was to reach out to a diverse group of individuals and to promote participation of women, minorities and early-career scientists. To this end, we partnered with organizations, such as the Society for Canadian Women in Science and Technology (SCWIST) and VanBUG, to attract local participants. To encourage early-career scientist involvement, we contacted undergraduate and graduate-level computational sciences and bioinformatics programs at regional universities. To reach the global scientific audience, we contacted several human genetics societies around the world, asking them to email participant application information to their respective mailing lists.

To promote economic diversity and lower the barrier to entry for international participants, we partnered with ASHG to create travel awards based on financial need and/or minority status. hackseq had no registration fee. Lastly, in the lead-up to the hackathon, we made announcements on Twitter, on the Galaxy Project’s events calendar, and at various international conferences, such as the Bioinformatics Open Source Conference 2016 and the 13th International Congress of Human Genetics.

In the participant application form, prospective participants ranked the top four projects on which they would like to work. Participants could apply for travel awards from ASHG and request child care, covered by our budget, to promote participation amongst parents. The organizing committee and the team leaders evaluated the applications based not only on applicants' skill levels, but also on their interests and passion for genomics. To ensure well-rounded teams, we considered both project preferences and skill levels during the team assignment phase, balancing novice and expert coders as well as biological and computational expertise. All forms developed for hackseq are available online (https://github.com/hackseq/October_2016/blob/master/Forms.md).

Because we defined the projects and teams beforehand, participants could get to know their team members and prepare technical infrastructure in advance. Teams hit the ground running, beginning work at 8 AM on the first day unprompted by the organizers.

hackseq had 66 participants in attendance from nine nations, divided into nine teams ranging from 3 to 10 individuals. Two of the ten team leaders withdrew prior to the hackathon for personal reasons, and one popular project split into two teams, resulting in nine teams. The modal age category was 30–34 years old (62.5%) for team leaders, and 25–29 years old (58%) for participants (Figure 1A). Graduate students made up the largest fraction of participants with 48.2%, followed by academic staff (15.5%), industrial scientists (13.8%), undergraduates (10.3%), postdoctoral fellows (6.9%) and academic faculty (5.2%). Notably, the team leaders were more likely to be industry scientists (44.4%) or young faculty (22.2%) (Figure 1B). In total, 22 of 62 (35.5%) participants identified as female and 40 as male. A total of 41% self-identified as Caucasian, 40% as Asian or Pacific Islander, and 19% as Arab, Latin American or unspecified (Figures 1C and D).

Figure 1. Participant diversity at hackseq 16.

To measure the diversity of hackseq participants, we asked team leaders and participants to self-report their (A) age, (B) current occupation, (C) ethnicity and (D) gender. Data are shown for team leaders (yellow) and participants (blue).

	Category	All Responses	Team Leaders
Gender	Male	40	7
	Female	22	2

Age	20-24	5	0
	25-29	30	1
	30-34	13	5
	35-39	8	2
	40-49	1	0
	50+	1	0

OS	OSX	38	na
	Linux	11	na
	Windows	10	na

Occupation	Undergraduate Student	6	0
	Graduate Student	28	2
	Post-Doc	4	0
	Academic Staff	9	1
	Academic Faculty	3	2
	Industrial Scientist	8	4

Ethnicity	Caucasian	26	na
	Arab	2	na
	Asian / Pacific Islander	25	na
	Latin American	1	na

Dataset 1. hackseq demographics.

De-identified demographic data from hackseq participants in the pre-meeting survey/confirmation of attendance.
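As a rough check, several of the percentages quoted for Figure 1 can be recomputed from the counts in Dataset 1. The following is a minimal sketch rather than part of the original analysis: the helper `percent` and the variable names are hypothetical, and the counts are copied by hand from the table.

```python
def percent(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Counts copied from Dataset 1 ("All Responses" and "Team Leaders" columns).
gender_all = {"Male": 40, "Female": 22}
occupation_leaders = {
    "Undergraduate Student": 0,
    "Graduate Student": 2,
    "Post-Doc": 0,
    "Academic Staff": 1,
    "Academic Faculty": 2,
    "Industrial Scientist": 4,
}
age_leaders = {"20-24": 0, "25-29": 1, "30-34": 5, "35-39": 2, "40-49": 0, "50+": 0}

# Share of all gender responses that were "Female".
female_pct = percent(gender_all["Female"], sum(gender_all.values()))

# Shares of team leaders by occupation and by modal age category.
leader_total = sum(occupation_leaders.values())
industry_leader_pct = percent(occupation_leaders["Industrial Scientist"], leader_total)
faculty_leader_pct = percent(occupation_leaders["Academic Faculty"], leader_total)
leaders_30_34_pct = percent(age_leaders["30-34"], sum(age_leaders.values()))

print(female_pct, industry_leader_pct, faculty_leader_pct, leaders_30_34_pct)
```

Note that the denominators differ between categories (for example, nine team leaders reported an occupation but eight reported an age), so each percentage is computed against the number of responses in its own category.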

(A-Z sorted)	-	Rate your subjective experience at hackseq (not column sorted)	-	-	-	-	-	-	-	-	-	Please rate the usefulness of the various computational infrastructure (1 = no use, 5 = critical)	-	-	-
Please write three single word adjectives to describe your experience at hackseq ( comma-delimited plz )	How did you find the 3-day (8h x 3) format of the hackathon?	Hostile(1)-Collaborative(5)	Boring(1)-Fun(5)	Uninformative(1)-Enlightening(5)	Chaotic(1)-Organized(5)	What can we do at hackseq to improve the experience of future participants?	What about hackseq did you enjoy the most?	Would you participate in an event like hackseq again in the future?	Would you be interested in helping organize a future hackseq event?	Which programming languages and tools did you and your team use during the course of hackseq? (Comma delimited please)	Briefly describe what were the computational requirements (e.g., AWS, ORCA, etc) for you during hackseq.	ORCA	Amazon AWS	On-site volunteers	Which resource(s) did you wish you had access to that was/were not provided?
Amazing, fun,knowledgeable	Too Long	4	4	2	2	Beer on site	Being introduced to new software for analysis to read up upon.	Yes	No thanks	Bash, python	ORCA	3	1	1
Amazing, intense, fun	Just right	4	4	5	4	Better compute options (ORCA was problematic at first)	collaborating	Yes	No thanks	python; bash; bwa; GATK; platypus;	ORCA - would have been helpful to have more local HPC - AWS is difficult for working with large datasets.	5	1	3	More high memory nodes.
Awesome!	Just right	5	5	5	5	Better heating, trash cans	Collaborating with a diverse group	Yes	No thanks	Bash and python	AWS and ORCA	4	5	4	More time from Shaun
collaborative, fun, exciting	Just right	3	5	4	2	Better integration for presentations - I felt bad missing some of them since i was in the middle of working, maybe have them first thing or last thing in the day to get more participation	Collaboration	Yes	No thanks	R, python, bash scripting		3	2	4
collaborative,friendly,stimulating	Just right	5	5	5	5	Better Server availability for compute	Collaboration among team members.	Yes	No thanks	python,platypus,linux,bwa,samtools	We used ORCA, but it was horribly slow and unresponsive, so we instead utilized our own HPCs for analysis. Didn't have time to learn and use the AWS.	1	1	5	A better version of ORCA, WestGrid?
collaborative,fun,enriching	Just right	4	4	5	3	Better support for team leaders	Collaboration, good sprit	Yes	No thanks	R, python, various software/packages	Fairly minimal for our project; we did it all on our laptops.	1	1	3
Collaborative,humbling,fun	Just right	4	5	4	4	can not think of right now	Collaborative atmosphere	Yes	No thanks	python, bash, c++, BLAST	AWS, ORCA	3	5	5
Collaborative,synergy,nerdy	Just right	5	5	5	5	Confusing to have posts at multiple places on GitHub, and on Slack. It would be helpful to have team leaders be able to have input on selecting team members.	connecting with people and have fun	Yes	No thanks	python, R, bedtools, bash		2	1	3
Diverse, enthusiastic, lighthearted	Just right	5	4	4	5	don't know yet	Everyone in my team was very collaborative and we worked great together.	Yes	No thanks	R, python, command line	orca	5	1	3
educational, explorative, supportive	Just right	4	4	3	5	Elimination of early tech issues	Fun environment and lots of cool projects and ideas	Yes	No thanks	python, R, bash	ORCA	2	2	3
Educational, fun, inspiring	Just right	5	5	5	5	Have better abstracts and requirements, and also food to cater to all needs like the vegetarians.	getting to play with new datasets	Yes	No thanks	shell, R	amazon server	5	3	5
enjoyable, interesting, motivating	Just right	4	4	4	4	Have clearer instructions for specific teams.	Happy hour/dinner	Yes	No thanks	samtools, bamtools, python, Flask, BioDalliance, GATK	AWS	3	5	3
Exciting, enlightening, fun-filled	Just right	5	4	5	3	Have the project leads put more information prior to the start so more can be done during the three days of the hackathon.	idea inspiration	Yes	No thanks	R,Shiny	R,Shiny	1	1	1
exciting, grueling, enjoyable	Just right	4	5	5	3	Healthier food choices please!	It was really awesome to see the collective depth of knowledge of everyone here at hackseq.	Yes	No thanks	python,bash,html,css,js,samtools,bedtools,igv,biodalliance,picard,gatk	Variant calling (lots of RAM and processors) and plenty of disk space	3	5	4
fantastic,tiring,interesting,epic,fun,collaborative	Just right	5	5	5	4	How to' guide for team leader (edited response)	learning about new ideas	Yes	No thanks	r shiny JavaScript	Ogans laptop	4	4	4
friendly,creative,delicious	Just right	5	5	5	5	I really can't think of anything. The location worked perfectly, the tech stuff was all great. The coffee/food was as required.	learning experience & collaboration	Yes	No thanks	python 3	ORCA	5	1	3
fun, busy, collaborative	Just right	5	4	5	3	I think it was perfect, maybe having a mic when talking	learning from different people	Yes	No thanks	python3	ORCA	5		4
Fun, collaborative, great	Just right	5	4	5	5	I would eliminate the optional talks, and increase the time by a day or get teams to plan more beforehand	Learnt a lot! Met some awesome people from varied backgrounds.	Yes	No thanks	python, R	AWS	1	5	4	can not think of right now
fun, frenetic, stimulating	Just right	5	5	5	5	I would prune out projects that are not strongly selected by the participants to make sure everyone is very interested in the project they're assigned.	Meeting and working with new people, learning from them	Yes	No thanks	python, ipython notebook, bwa, samtools, longranger (10x genomics' software)	AWS only, single large memory instance	1	5	4
fun, impressed, worked	Just right	5	5	5	5	Internet access crapped out a couple times. Orcha was slow also.	Meeting intelligent and friendly teammates	Yes	Yes	R, AWK, python, BASH	AWS, ORCA	3	5	5	X11 forwarding and tunneling for RSTUDIO IPYTHON
fun, mindblowing, humbling	Just right	5	5	5	5	It was just great!	Meeting new people	Yes	Yes	R, python	Laptop
fun, productive,informative	Just right	5	5	5	4	It's pretty good already.	Meeting new people and working as a team	Yes	Yes	python	ORCA, but due to issues, many of us had to use our own clusters	5	1	4
fun, stressful, exciting	Just right	3	3	3	3	Maybe ensure that teams get evenly manned	Meeting new people in bio	Yes	Yes	bash scripts	AWS	3	4	3	None.
Fun, teamwork, learning	Just right	4	4	5	4	Maybe some more intro-level material for people interested in bioinformatics but are new to it.	Meeting new people with different skill sets and working on an interesting problem	Yes	Yes	python,R,LongRanger,jyupiter	Just AWS for data and jobs, and github for code. Created an HTTP server instance on AWS to host BAMs.	1	5	5
fun,creavity,collaborative	Just right	4	4	4	5	More information in terms of computing resources in advance of the hackathon, so that we don't waste time figuring that out at the event.	Meeting other people and opening my mind to new ideas	Yes	Yes	Shell, python, heaps of tools	ORCA was very slow to start with so a lot of us ended up using our home facility's HPC	3	1	4
fun,hectic,challenging	Just right	5	4	5	5	More space to hack late	Meeting people, exploring technologies	Yes	Yes	python	ORCA	5	1	5	nothing
funny, friendly, interesting	Just right	5	4	5	4	More teams. Option to combine teams at any point	Meeting so many people from other projects doing great amazing things! Sharung ideas and solutions too	Yes	Yes	R, plotly and Shiny	Nothing.	1	1	1	Individual rooms for each team
good,nerdy,fun	Just right	5	5	5	5	not overlap workshop with the teamwork, sometimes, one can not join workshop since the team is working together	People	Yes	Yes	R, python	ORCA, AWS	5	5	4
great, learning, team-building	Just right	3	3	4	2	Nothing! This was really perfect.	Pizza on site	Yes	Yes	python, velvet	AWS	1	5	3	A ready-set-go AWS instance & background material on the project
informative, fun, disorganized	Just right	4	5	5	4	Overnight areas onsite	Ribbing my teammates	Yes	Yes	python and R	AWS	3	5	5	none
informative, fun, organized	Just right	5	5	5	4	Prep by participants before	Selected projects beforehand, no internet problems, excellent venue/food/organization	Yes	Yes	R,shiny,plotly	internet and a laptop with R studio and git	1	1	2	none
informative, learning and wonderful	Just right	5	5	5	3	Protein in the breakfasts!	team work and learn from each other.	Yes	Yes	python	ORCA	3	1	2	I just wanted ORCA to work better
Informative,Amazing,Codejockey	Just right	4	3	5	3	Put an address and a map on the website. Make sure information is communicated on all channels (e-mail, Slack, etc.). Don't overrun on the talks. Sort out all computational infrastructure.	Teamwork	Yes	Yes	python	AWS, 1 node	1	5	3	None
insightful, exciting, worth-it	Just right	5	4	5	5	Spread the news	teamwork	Yes	Yes	R	Just laptop	1	1	1	none
inspiration, collaboration, awesome	Just right	4	3	3	4	start later in the day, warmer space, more chances to get to know team in the beginning	Testing out crazy new ideas	Yes	Yes	R, python	ORCA	3	3	4
inspiring,challenging,fun	Just right	5	5	5	4	Students were passing through area during the last day	The challenge	Yes	Yes	BASH, python, Bedtools, Samtools, R, Junyper Notebooks (formerly ipython), GoogleDocs	AWS was essentially for us as we needed over 4T of storage space, many nodes for processing data and a lot of RAM	2	5	2
instructive,fun,intense	Just right	5	5	5	3	The exact problem wasn't defined or explained very well beforehand. I had to spend the first day learning instead of coding.	The collaboration and environment	Yes	Yes	python,bash,R	Our own clusters, orca	3	3	3	HPC
intense, enlightening, enriched	Just right	5	5	5	1	The initial expectations within the team could be made more accessible - slack communication got a bit too chaotic to keep track of prior to the hackathon	The collaborative aspect of delivering a project using different skill sets in the group.	Yes	Yes	R, shiny	our beloved laptops	1	1	1
intense,dizzying,learningerific	Just right	5	5	5	4	Travel grants, extra time (outside of hacking) for events/workshops	The novelty of projects and team compositions just perfect to handle to those situations.	Yes	Yes	python3,PLINK1.9,IMPUTE2,bcftools1.3	ORCA and a few heavy computational calculations on clusters at institutions back home.	5	1	4	Perhaps the ability to get a few hundred hours of CPU time on an easy to access system such as ORCA.
intense,productive,exciting	Just right	4	4	5	5	T-shirts?	the opportunity to work with an expert and collaborate to resolve a relevant scientific problem	Yes	Yes	python, R, pysam	AWS. Fairly light compute requirements in our case.		5		Biggest IT issue: no open ports to AWS instance -- made it hard to get ad-hoc servers running (ipython notebook, rstudio, http mirroring, etc). Per-user accounts on AWS: needed to spend a bunch of time configuring these. (may just be my lack of experience w/ AWS), would be good to have in HOWTO?
intense; unexpected; fun	Just right	5	5	5	5	Unsure if people with my level of inexperience have much of a place in teams	The team bonding	Yes	Yes	python (conda, pyasm etc.), ipython, bash, make, longranger (10x genomics), R	AWS, various servers on the machine.	1	5	4	More time to establish accounts on the machine prior to the start. We were all using same user account and it could possibly lead to a disaster.
interactive, friendly,rewarding	Just right	5	5	5	5	We could probably have more information about the project before hand so we are able to select our preferences better	The wonderful people!	Yes	Yes	R package, python			5	5
interesting, intense, fun	Just right	5	5	5	5		Three days was a great amount of time - wonderful to meet with people.	Yes	YES!	bash,pytho,web	AWS	1	5	4
Interesting,conceptual,algorithmic	Just right	4	3	5	4		Vancouver	Yes	YES!	python3,plink,R,vcftools,bcftools,impute2	ORCA and cluster at one of the participants institutions	4	1	4
Interesting,educational,cool	Just right	5	5	4	5		variety of projects	Yes	YES!	R, python, bash, memesuite	Didn't use AWS that much, it was mostly run on our laptops	3	3	5	None
interesting,great,fantastic	Just right	5	5	5	5		Working as a team, brainstorming ideas, full day sessions to focus on work, Meeting deadline	Yes	YES!	python,Ipython notebook,bwa,velvet	Used a single node in AWS. Worked brilliantly	1	5	5	Really nothing. The 10x guys got basic bioinformatics tools set up on the AWS node very quickly. Might have been useful for other teams to make that easier but we had absolutely no problems
Learning, inclusive, productive	Just right	3	4	3	3		Working with a new set of people, the wrap ups at the end of the day and the venue.	Yes	YES!	bash, python, R, bedtools, ChIPSeeker, MotifGP	We didn't use AWS or ORCA. It wasn't clear how to access ORCA, and AWS required some set-up that we didn't want to go through. I would recommend that if you're going to use AWS in a future hackathon, then have the instances ready to go so that the participants don't have to worry about configuring the servers.	1	1	5	Preconfigured cloud instances.
productive,ambitious,exhausting	Too Long	4	3	5	4			Yes	YES!	bash	AWS, ORCA	5	5	5	NA
supercalifragilisticexpialidocious	Too Long	4	4	4	3			Yes	YES!	python, nextflow, bash	We used ORCA, and resources from our home intitutions	3	2	5	absolutely nothing!
tantalizing, intensive, rewarding	Too Long	5	5	5	5			Yes	YES!	python, R; Optimization packages & blackbox implementations	Basic programming capabilities, CLI, Version control (Git/Github), experience with packages & libraries	4	3	4	N/A
	Too Short	4	4	5	4			Yes	YES!	python,bash,magicblast	AWS,ORCA	5	5	4
	Too Short	4	4	5	5			Yes	YES!	python, Bash	AWS servers and ORCA	3	5	1	None!
	Too Short	5	5	5	3			Yes	YES!	python	AWS 16 cores, 128 GB RAM	3	4	1	The resource is great, no need to ask for more

Dataset 2. Post-hackseq survey responses.

De-identified post-hackseq survey response data for the figures.

Technical and logistical requisites

Hackathons have few essential resource requirements. In this section, we outline the core logistics and technical infrastructure we employed. While these requisites could be stripped down, our experience was that attention to and planning for these details allowed our teams to focus on coding and development.

Core logistics

To encourage participation, hackseq had no entry cost for participants. To ensure teams could focus on the hackathon and not on technical or logistical issues, we secured funding for the venue, technical infrastructure, food, transportation and stationery by partnering with different organizations.

We created a sponsorship package to approach different academic, non-profit and industry organizations. Besides asking for financial support, we also made communication and marketing requests, given that one of hackseq’s goals was to recruit a diverse pool of participants. We placed a strong emphasis on women’s groups in science and technology.

In November 2015, we contacted ASHG to ask if we could be a satellite event for their meeting. Given that the ASHG 2016 conference was planned to be held in Vancouver, hackseq gained exposure from ASHG's communication strategy. ASHG also provided three travel grants to participants based on financial need and diversity.

These partnerships allowed hackseq to take place in a large, bright atrium at the University of British Columbia, where all the teams could work in a single space and interact with one another. Food was provided to minimize distraction, and two social events were hosted, one on the first night and one on the last night, to encourage collaboration and networking amongst participants.

Technical infrastructure

Reliable technical infrastructure is necessary for organizing a successful hackathon: primarily electrical power, Internet access and computing resources. We ensured the venue had adequate electrical outlets for the participants’ laptops and arranged for a dedicated Wi-Fi network connection to be established for the event through the university's information technology office.

Unlike many hackathons, hackseq was not restricted to coding. It also included genomic data analyses, which required additional computational resources. To promote reproducibility and collaboration, all the projects were based on pre-organized GitHub teams and repositories (see the hackseq organization on GitHub; https://github.com/hackseq). To provide teams with reliable and powerful computation, we secured in-kind donations of cloud computing from Amazon Web Services Elastic Compute Cloud (AWS-EC2) and from the genOmics Research Container Architecture (ORCA) of Canada’s Michael Smith Genome Sciences Centre. We used Linuxbrew, a cross-platform package management tool, to install bioinformatics software on ORCA⁵.

AWS-EC2 and ORCA were used equally by participants (43% each, not mutually exclusive), and an additional 12% used high-performance computing resources from their home institutions. Users preferred resources with which they already had experience, and reported that it was not feasible to learn a new computing resource in the given time. For future events, we advise giving team leaders and participants access to computing resources ahead of time so that they can experiment and familiarize themselves with the different resources.

Each team chose which programming languages and software to use. The majority of participants relied on the Python (82.6%) and R (53.8%) programming languages, and also used specialized software related to their particular projects (Figure 2).

Figure 2. Software usage during hackseq 16.

At the conclusion of hackseq, we asked participants to complete a survey on their experience at hackseq. There were 52 responses to the question, “Which programming languages and tools did you and your team use during the course of hackseq? (Comma delimited please).” These responses were parsed and the number of unique instances is reported. Languages or software listed fewer than twice are reported as ‘Other’.
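The parsing step described in this caption can be sketched in a few lines of Python. This is an illustrative reconstruction, not the script used to produce Figure 2: the function name `count_tools`, the choice to split on semicolons as well as commas, and the case-insensitive matching are all assumptions. The sample responses are copied from Dataset 2.

```python
import re
from collections import Counter


def count_tools(responses, min_count=2):
    """Count how many responses mention each tool, lumping rare tools into 'Other'."""
    counts = Counter()
    for response in responses:
        # Split on commas or semicolons (assumption), normalize case and whitespace.
        # A set is used so a tool repeated within one response is counted once.
        tools = {t.strip().lower() for t in re.split(r"[,;]", response) if t.strip()}
        counts.update(tools)

    summary = Counter()
    for tool, n in counts.items():
        if n < min_count:
            summary["Other"] += n  # tools listed fewer than min_count times
        else:
            summary[tool] = n
    return summary


# A handful of free-text responses copied from Dataset 2.
sample = [
    "Bash, python",
    "python; bash; bwa; GATK; platypus;",
    "R, python, bash scripting",
    "python,platypus,linux,bwa,samtools",
    "R,Shiny",
]

tool_counts = count_tools(sample)
for tool, n in tool_counts.most_common():
    print(f"{tool}\t{n}")
```

Free-text answers such as "bash scripting" or "R, plotly and Shiny" do not split cleanly on delimiters, so a real analysis would likely need some manual curation of synonyms before counting.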

In summary, the infrastructure requisites for running a successful hackathon are minimal and many can be acquired as in-kind donations from related organizations. In highlighting the essentials and key lessons, we hope to encourage the motivated reader to run a local scientific hackathon.

Research project summaries

The projects undertaken during hackseq were from diverse fields within bioinformatics, ranging from human genomic variation analysis, microbial ecology and transcriptomics, to bioinformatic algorithm development. The projects were proposed by the team leaders, who defined the scope of the work with the idea that, at the end of the 72 hours, each team would have developed a working prototype. At hackseq, the teams organized organically, with team leaders defining the problem and teaching the participants the necessary background while participants offered their expertise on how to implement a solution. In this way a collaborative project goal could be explicitly defined (and re-defined), and the teams would work towards completing that goal together.

Here we provide brief summaries from the projects. Scientific abstracts, videos of final presentations and updated information on each project can be found at www.hackseq.com/projects16.

VASCO: Visualization app for single cell exploration (led by Grace X.Y. Zheng)

Modern transcriptomics analysis tools have limited capacity for analyzing single-cell RNA-sequencing (scRNA-seq) data from thousands of cells. VASCO is an intuitive user interface to visualize gene-cell expression and cell clustering data to explore the relationship between populations of cells and gene expression, including cell cluster of differentiation markers (CD-markers). This project was awarded the “People’s Choice” for the most outstanding project developed at hackseq.

XYalign: Hacking sex chromosome variation (led by Melissa A. Wilson Sayres)

Human sex chromosomes violate typical ploidy assumptions made for NGS autosome copy number and variant measurement, which is further confounded by mis-alignment between the X and Y chromosomes. XYalign was developed to measure sex chromosome ploidy and remap reads based on the inferred sex for downstream analysis.

ParetoParrot: A tool to optimize the parameters of command line software (led by Shaun Jackman)

Many bioinformatics programs, such as those for genome sequence alignment and assembly, require optimization of several input parameters to maximize a target metric. ParetoParrot measured the performance of several ‘black-box’ optimization algorithms to improve the performance of genome sequence assembly software.

BaklavaWGS: Pseudo-WGS variant calling for common cell types aggregating ChIP-seq, RNA-seq and DHS from ENCODE and Roadmap Epigenomics data (led by Luca Pinello)

There is a wealth of sequencing datasets for cell types that have helped to understand and prioritize non-coding variants. Unfortunately, for many of those cell types we still don't have complete genotype information. BaklavaWGS recovers genotype data for cell lines by aggregating sequencing data, to aid downstream allele-specific analysis. A preliminary analysis is available at http://www.baklavawgs.com/.

Evaluating epigenetic modifications in ChIP-seq and methylation data across cell types and states (led by Manuel Belmadani)

A variety of datasets and approaches were investigated for analyzing cell type and state-specific genome regulation. The outcome of the experimental work in exploring differentially methylated regions from different epigenomic data and public databases, such as ENCODE ChIP-seq, IHEC and JASPAR, is presented.

Selection of tag SNPs for an African SNP array by LD and haplotype based methods (led by Tommy Cartensen)

Commercial SNP arrays fail to capture the diversity of African populations and limit the capacity to conduct large-scale medical genetic studies. Using African whole genome sequencing (WGS) data, an algorithm was developed to quickly identify SNP tags for this population. This will be used to improve upon SNP arrays for this richly diverse continent.

Somatic mutation from separated haplotypes (SMUSH) (led by Patrick Marks)

Calling somatic mutations relies on matched tumour and normal DNA sequencing, but a matched normal sample is often not available. The SMUSH algorithm was developed to differentiate wild type, germline and somatic mutations from linked-read DNA sequencing libraries.

MetaGenius (led by Michael Schnall-Levin)

Analysis of shotgun metagenomic sequencing data is limited in its capacity to assemble over homologous sequences. MetaGenius uses linked-read DNA sequencing to improve the assembly of a mixture of five bacterial species.

mICP: Metagenomic indicator contig predictor (led by Ben Busby)

Metagenomic sequencing has largely focused on 16S rRNA amplicons. The mICP strategy uses a mixture of long PacBio and short Illumina reads to identify contigs from environmental sequencing samples that predict the environmental state of the samples in which they were found.

Discussion

The overarching themes of hackseq were inclusivity, open science and collaboration. To gauge the extent to which we were successful in delivering on these themes, we performed a final survey at the conclusion of hackseq. Participants overwhelmingly described their experience as positive (Figure 3), with 100% (n = 54) of the survey respondents indicating that they would participate in an event like hackseq again and a further 80% indicating that they would like to take on an organizational/leadership role in future hackseq events. Participants specifically highlighted that hackseq created ample recruitment, employment and collaborative opportunities, while also exposing participants to different datasets and analyses. We believe this reflects the underlying desire amongst young scientists to share, collaborate and learn from one another. They only need be given the opportunity to do so.

Figure 3. Quantification of subjective experience.

To gauge the quality of participants' experience, we surveyed them after the event. (A) We asked, “Please write three single word adjectives to describe your experience at hackseq?” Responses were parsed and used to make a word-cloud (www.wordle.net), in which the size of each word is proportional to the number of times it occurs in the survey responses. For scale, in 50 responses: ‘fun’ was mentioned 26 times; ‘exciting’ 6 times; and ‘supercalifragilisticexpialidocious’ once. (B) Additionally, we asked participants to rate four dimensions of their experience on a linear scale from 1 to 5. The kernel density of responses for each dimension is shown, with a red dotted line marking the mean response.
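The word counts behind a word-cloud like panel A can be tallied in a few lines. The sketch below is only an illustration of that parsing step, not the script used for the figure: the function name `adjective_counts` is hypothetical and the responses are made up rather than taken from the survey dataset.

```python
from collections import Counter
import re


def adjective_counts(responses):
    """Count case-insensitive words across free-text survey responses."""
    counts = Counter()
    for text in responses:
        # Split on anything that is not a letter, so commas and semicolons work.
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts


# Made-up responses for illustration; not the hackseq survey data.
responses = [
    "Fun, exciting, intense",
    "fun productive collaborative",
    "Fun; educational; exciting",
]
counts = adjective_counts(responses)
for word, n in counts.most_common(2):
    print(word, n)
```

A word-cloud tool would then scale each word by its count. Note that this simple pattern splits hyphenated words, which may or may not be desirable for real responses.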

By organizing hackseq as a satellite meeting of an international conference like ASHG, we were able to attract team leaders and participants from around the world, including a large proportion of young investigators and female participants (Figure 1). The proportion of female participants at hackseq (35.5%) was higher than the proportions reported for other hackathons for which data are available: 20% at NASA’s Space Apps Challenge (https://www.fastcompany.com/3059036/most-creative-people/what-do-women-want-at-hackathons-nasa-has-a-list) and 15% at Spotify-organized hackathons (https://labs.spotify.com/2015/01/13/diversify-how-we-created-a-hackathon-with-50-50-female-male-participants/). We believe this to be a consequence of starting with a representative organizing committee and specifically encouraging female participation during recruitment. However, this comparison is confounded by differences in starting demographics between computer science/engineering students and bioinformatics/biology students.

To further increase global representation at future hackseq events, we recommend providing additional targeted travel awards or remote participation options to reduce proximity/cost restrictions. Further improvements could include educational resources to address common technical issues, the provision of an overnight area for participants who would like to continue to work after hours and additional activities to encourage interaction with members from different teams.

Conclusion

The nature of biological sciences has shifted to an increasing emphasis on computational analysis. Collaborative events, such as hackseq, offer an exciting platform to bring together a wide spectrum of scientists to work together and innovate. We present demographic information about the first hackseq hackathon and encourage future organizers to do likewise, to quantify social inequalities that may be present in such events, and strive to achieve equal representation in the sciences. It’s our hope that the information presented here will aid and encourage others in organizing genomics hackathons.

Data availability

Dataset 1: hackseq demographics: De-identified demographic data from hackseq participants in the pre-meeting survey/confirmation of attendance. doi, 10.5256/f1000research.10964.d152802⁶

Dataset 2: Post-hackseq survey responses: De-identified post-hackseq survey response data for the figures. doi, 10.5256/f1000research.10964.d152803⁷

Author contributions

All members of the hackseq Organizing Committee 2016 contributed equally to hackseq and participated in the discussions expressed in this manuscript:

Artem Babaian, Terry Fox Laboratory, BC Cancer Agency, Vancouver, BC, Canada

Britt Drögemöller, Faculty of Pharmaceutical Sciences, University of British Columbia

Bruno M Grande, Department of Molecular Biology and Biochemistry, University of British Columbia

Shaun D Jackman, BC Cancer Agency Genome Sciences Centre

Amy Huei-Yi Lee, Department of Microbiology and Immunology, University of British Columbia

Santina Lin, Bioinformatics Training Program, University of British Columbia

Catrina Loucks, Department of Molecular Biology and Biochemistry, Simon Fraser University

Adriana Suarez-Gonzalez, Department of Botany, University of British Columbia

Tiffany Timbers, Masters of Data Science & Department of Statistics, University of British Columbia

Galen Wright, Centre for Molecular Medicine and Therapeutics, BC Children's Hospital Research Institute, University of British Columbia

AB, BD, BG, AL, SL, ASG and GW wrote the first draft of the manuscript. All authors were involved in the revision of the manuscript and agreed to the final version.

Competing interests

No competing interests were disclosed.

Grant information

The author(s) declared that no grants were involved in supporting this work.

Acknowledgements

We would first and foremost like to thank the hackseq participants, without whom this event would not have happened. By team, we would like to thank: Jean-Christophe Berube, Ogan Mancarci, Erin Marshall, Edward Mason, Celia Siu, Ben Weisburd, Shing Hei Zhan, Grace X.Y. Zheng; Madeline Couse, Bruno Grande, Eric Karlins, Tanya Phung, Phillip Richmond, Timothy H. Webster, Whitney Whitford, Melissa A. Wilson Sayres; Craig Glastonbury, Daisie Huang, Hamid Younesy, Jasleen Grewal, Laura Gutierrez Funderburk, Lisa Bang, Shaun Jackman, Veera Manikandan Rajagopal, Y. Brian Lee; Carolyn Ch'ng, David Brazel, Karthigayini Sivaprakasam, Jill Moore, Shobhana Sekar, Stephen Kan, Jing Yun Alice Zhu, Ka-Kyung Kim, Luca Pinello; Fotis Tsetsos, Kieran O'Neill, Shreejoy Tripathy, Manuel Belmadani; Ayton Meintjes, Scott Hazelhurst, Vincent Montoya, Marcia MacDonald, Jocelyn Lee, Dan Fornika, Brian Lee, Austin Reynolds, Tommy Carstensen; Amanjeev Sethi, Eric Zhao, Hua Ling, Patrick Marks, Peng Zhang, Samantha Kohli; Erik Gafni, Dan Kvitek, Jake Lever and Michael Schnall-Levin; Ben Busby, Justin Chu, Jessica Hardwicke, Sean La and Feng Xu.

We would like to thank our sponsorship partners 10X Genomics, ECOSCOPE, Amazon AWS, American Society of Human Genetics, Vancouver Tourism, Genome British Columbia, Association for Computing Machinery – Women, Affymetrix, bioinformatics.ca, and GitHub. We also partnered with local organizations for logistical support: Society for Canadian Women in Science and Technology, BC Cancer Agency Graduate Student and Post-Doctoral Fellow Society, National Center for Biotechnology Information and the Vancouver Bioinformatics User Group. Sponsorship partners had no role in data collection and analysis or preparation of the manuscript. Thanks to the reviewers for their time and suggestions.

References

1. Stephens ZD, Lee SY, Faghri F, et al.: Big Data: Astronomical or Genomical? PLoS Biol. Public Library of Science; 2015; 13(7): e1002195.
2. Prins P, de Ligt J, Tarasov A, et al.: Toward effective software solutions for big biology. Nat Biotechnol. 2015; 33(7): 686–687.
3. Busby B, Lesko M; August 2015 and January 2016 Hackathon participants, et al.: Closing gaps between open software and public data in a hackathon setting: User-centered software prototyping [version 2; referees: not peer reviewed]. F1000Res. 2016; 5: 672.
4. Richard GT, Kafai YB, Adleberg B, et al.: StitchFest: Diversifying a College Hackathon to Broaden Participation and Perceptions in Computing. Proceedings of the 46th ACM Technical Symposium on Computer Science Education - SIGCSE ’15. New York, USA: ACM Press; 2015; 114–119.
5. Jackman S, Birol I: Linuxbrew and Homebrew for cross-platform package management [version 1; not peer reviewed]. F1000Res. 2016; 5(ISCB Comm J): 1795 (poster).
6. hackseq Organising Committee (2016): Dataset 1 in: hackseq: Catalyzing collaboration between biological and computational scientists via hackathon. F1000Research. 2017.
7. hackseq Organising Committee (2016): Dataset 2 in: hackseq: Catalyzing collaboration between biological and computational scientists via hackathon. F1000Research. 2017.

Article Versions (2)

version 2 (revised). Published: 10 Apr 2017, 6:197. https://doi.org/10.12688/f1000research.10964.2

version 1. Published: 28 Feb 2017, 6:197. https://doi.org/10.12688/f1000research.10964.1

how to cite this article

hackseq Organizing Committee 2016. hackseq: Catalyzing collaboration between biological and computational scientists via hackathon [version 2; peer review: 2 approved]. F1000Research 2017, 6:197 (https://doi.org/10.12688/f1000research.10964.2)

NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Version 2 (revised), published 10 Apr 2017

Reviewer Report 08 May 2017

Kate L. Hertweck, Department of Biology, The University of Texas at Tyler, Tyler, TX, USA

Approved

https://doi.org/10.5256/f1000research.12239.r21700

My comments have been accommodated.

Reviewer Report 13 Apr 2017

Jiarong Guo, Center for Microbial Ecology, Michigan State University, East Lansing, MI, USA

Approved

https://doi.org/10.5256/f1000research.12239.r21699

My previous comments have been addressed by authors in the new version and also in comment section. I have no further comments to make.

Version 1, published 28 Feb 2017

Reviewer Report 31 Mar 2017

Jiarong Guo, Center for Microbial Ecology, Michigan State University, East Lansing, MI, USA

Approved

https://doi.org/10.5256/f1000research.11818.r20964

The authors reported a detailed summary of their genomic hackathon, hackseq, to help those interested in organizing similar hackathons in future. The hackseq brought together 66 biological and computational scientists with diverse demographic background to collaborate on nine projects on genomics ranging from data visualization to algorithm development. All the participants had positive responses in post assessment and showed interests in future hackathons.

The background is clearly articulated. There are detailed descriptions on hackseq format and technical and logistical requisites, which are useful for future hackathons. Brief research project summaries are also described with more information available on GitHub. The data for reproducing the figures are made available on F1000 and schedules and application forms are available on GitHub.

Major comments:
Strengths:
Overall, sharing details and experiences of the hackseq such as recruiting project leaders and participants, assigning teams, logistical and technical requisites, and post assessment is valuable for the open science community to organize future hackathons.

It is a great idea to organize the hackathon as a satellite event of bigger events. The bigger events can help on travel cost of participants and more importantly promote hackathon participations.
This hackseq is very successful at recruiting diverse participants, because it has a representative committee that encourages female, minority and early career scientist participations and also good promotion strategies that it partners with organizations such as Society for Canadian Women in Science.

Weakness:
The team leaders seem to have a critical role in each project, but their roles and responsibilities during the hackathon are not clearly mentioned in the manuscripts.

Minor comments:
Last paragraph in “Hackseq format” section: it states that participants ranked top three projects on which they would like to work, but there are four projects of choice in the application form.
First paragraph in “Core logistics” section: stationary -> stationery.
Second paragraph in “Discussion”: female percentage in biology is significantly higher than in engineering. Thus the direct comparison of female participation rate of hackseq to engineering (NASA and Spotify) hackathons is not meaningful.
Discussion: assessment is difficult with participants from diverse background. Some discussion on the current assessment and possible improvement would be useful.

Competing Interests: No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Reader Comment 03 Apr 2017

Shaun Jackman, BC Cancer Agency Genome Sciences Centre, Canada

Thank you for your review, Jiarong.

> The team leaders seem to have a critical role in each project, but their roles and responsibilities during the hackathon are not clearly mentioned in the manuscripts.

I helped organize Hackseq and was also a project leader. Key responsibilities of the team leaders are:

Prior to the event

1. Describing the proposed project.
2. Describing to the organizers:
   1. desired number of participants and skill set
   2. required compute resources
   3. required software
   4. required data
   5. logistical requirements
3. Discussing the project with interested participants.
4. Discussing the suitability of the participants assigned to the project with the organizers.
5. Confirming that the required software is installed and works.
6. Downloading the required data.
7. Planning the scope and strategy of the project.
8. Dividing the project into separable components.

During the event

1. Introducing the participants to each other.
2. Introducing the project and necessary background information to the participants.
3. Describing the components of the project to the participants.
4. Assigning those components to participants based on their interest.
5. Periodically discussing progress with the participants.
6. Troubleshooting technical issues with the help of organizers when needed.
7. Organizing the final report and presentation.

Competing Interests: No competing interests were declared.

Reviewer Report 31 Mar 2017

Kate L. Hertweck, Department of Biology, The University of Texas at Tyler, Tyler, TX, USA

Approved

https://doi.org/10.5256/f1000research.11818.r20626

Thank you to the authors for writing a summary of what seems to be a very successful collaborative coding event, with this manuscript in particular focused on preparation for the event, managing logistic concerns during the event, and an overview of the projects supported. The manuscript is quite well written, and I have no concerns about the content presented therein.

I especially appreciate the recommendations for how to solicit diverse leaders/participants, engage with partner organizations, and carefully craft a sense of community among attendees. Moreover, the authors include suggestions on how to improve similar events in the future. The data reported here provide an important context for comparison for events which continue to encourage participation from underrepresented groups.

Although not highlighted in the paper, the itinerary for the three-day meeting described here includes a number of additional details which would be useful to other coding event organizers. For example, while the majority of meeting time was dedicated to team work, extra workshops and talks were offered (e.g., introduction to git) that would encourage skills development for students or other participants new to the field. I'll be interested to see whether this model for hosting a hackathon continues at ASHG or other meetings.

Competing Interests: No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Alongside their report, reviewers assign a status to the article:

Approved - the paper is scientifically sound in its current form and only minor, if any, improvements are suggested

Approved with reservations - a number of small changes, or sometimes more significant revisions, are required to address specific details and improve the paper's academic merit

Not approved - fundamental flaws in the paper seriously undermine the findings and conclusions
