Research Article

A TALE of shrimps: Genome-wide survey of homeobox genes in 120 species from diverse crustacean taxa

[version 1; peer review: 2 approved, 1 approved with reservations]
PUBLISHED 17 Jan 2018

Abstract

The homeodomain-containing proteins are an important group of transcription factors found in most eukaryotes, including animals, plants and fungi. Homeobox genes are responsible for a wide range of critical developmental and physiological processes, ranging from embryonic development and innate immune homeostasis to whole-body regeneration. Given the continued fascination of developmental and evolutionary biologists with this key class of proteins, multiple efforts have thus far focused on the identification and characterization of homeobox orthologs from key model organisms in attempts to infer their evolutionary origin and how this underpins the evolution of complex body plans. Despite their importance, the genetic complement of homeobox genes has yet to be described in one of the most valuable groups of animals representing economically important food crops. With crustacean aquaculture being a growing industry worldwide, systematic and cross-species identification of crustacean homeobox orthologs is necessary in order to harness this genetic circuitry for the improvement of aquaculture sustainability. Using publicly available transcriptome data sets, we identified a total of 4183 putative homeobox genes from 120 crustacean species that include food crop species, such as lobsters, shrimps, crayfish and crabs. Additionally, we identified 717 homeobox orthologs from 6 other non-crustacean arthropods, which include the scorpion, deer tick, mosquitoes and centipede. This high-confidence set of homeobox genes will now serve as a key resource to the broader community for future functional and comparative genomics studies.

Keywords

Crustacean, homeobox, TALE, comparative genomics, arthropod, homeodomain

Introduction

As one of the fastest growing industries, the seafood trade is dominated by fishing and farming of crustaceans, with annual sales exceeding $40 billion (Stentiford et al., 2012). Crustacean aquaculture is multi-faceted: it not only contributes to the ever-increasing demands of international markets, but is also directly linked to the socio-economic aspects of many developing nations through the creation of jobs and infrastructure. Aquaculture practices have intensified in recent years to cope with the demand. Yet many of these practices are not sustainable, since the increased densities of farmed shrimps often serve as hotbeds for pathogens if left unchecked, causing infectious diseases and the devastation of cultures, which results in massive financial losses. As a result, regulations associated with aquaculture diseases are being enforced, with emphasis placed on preventative measures, e.g. enhancement of broodstock and research aiming to further our understanding of crustacean development and of ways to utilize the innate ability of crustaceans to combat pathogens (Lai & Aboobaker, 2017; Stentiford et al., 2012).

Several conserved molecular genetic circuitries are well known for regulating many aspects of development and innate immune homeostasis. One prominent example is the homeobox genes, which encode a family of transcription factors defined by the presence of a homeodomain (Holland, 2013). Homeobox genes are among the most important master controls in development, and some headway has already been made in understanding their involvement in innate immunity; for example, Caudal in Drosophila melanogaster is implicated in commensal-gut mutualism (Ryu et al., 2004; Ryu et al., 2008). Given their importance, major efforts have thus far focused on characterization of homeobox genes in well-known model organisms such as humans (Garcia-Fernàndez, 2005; Holland et al., 2007), Caenorhabditis elegans (Bürglin, 1997), D. melanogaster (Mukherjee & Bürglin, 2007), planarians (Currie et al., 2016; Felix & Aboobaker, 2010; Garcia-Fernandez et al., 1991), amphioxus (Luke et al., 2003), teleost fish (Mulley et al., 2006) and many more. Although homeobox orthologs have been previously studied in the crustacean Parhyale hawaiensis (Kao et al., 2016), systematic and cross-species characterization of this gene family across the broader Crustacea, with a focus on food crop species, is currently lacking. A better understanding of homeobox genes in crustaceans is therefore required to address this major shortfall, which motivates the present work.

Methods

Transcriptome data sets and query sets

We retrieved complete transcriptome data sets for the 120 crustacean species available at the time of manuscript preparation from the European Nucleotide Archive. Six non-crustacean arthropod proteomes were retrieved from UniProt. A complete list of accessions used in this study is provided in Supplementary Table 1. Query sequences for the subsequent homology searches were retrieved from UniProt and GenBank.

Identification of homeobox orthologs

Based on a previously published workflow (Lai & Aboobaker, 2017), we used multiple Basic Local Alignment Search Tool (BLAST)-based approaches, such as BLASTp and tBLASTn, to identify genes with homeodomain sequences. The BLAST results were filtered by an e-value threshold of < 10⁻⁶ and by best reciprocal BLAST hits against the GenBank non-redundant (nr) database, and redundant contigs having at least 95% identity were collapsed using CD-HIT. We then used HMMER (version 3.1) with profile hidden Markov models (HMMs) (Finn et al., 2011) to scan the best reciprocal nr BLAST hits for the presence of Pfam homeodomains (Bateman et al., 2004), and compiled a final non-redundant set of crustacean and arthropod homeobox gene orthologs (Dataset 1).
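The e-value filter and the reciprocal-best-hit step can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes the BLAST searches were saved in the default 12-column tabular format (`-outfmt 6`), takes the best hit per query to be the one with the highest bit score, and uses hypothetical function names (`best_hits`, `reciprocal_best_hits`) chosen for illustration only.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tabular_text, max_evalue=1e-6):
    """Map each query to its highest-bitscore subject among hits with e-value < max_evalue."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        if float(hit["evalue"]) >= max_evalue:
            continue  # fails the e-value cutoff
        bitscore = float(hit["bitscore"])
        current = best.get(hit["qseqid"])
        if current is None or bitscore > current[1]:
            best[hit["qseqid"]] = (hit["sseqid"], bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward_text, reverse_text, max_evalue=1e-6):
    """Return sorted (query, subject) pairs that are each other's best hit.

    forward_text: transcripts searched against the reference database.
    reverse_text: reference sequences searched back against the transcripts.
    """
    forward = best_hits(forward_text, max_evalue)
    reverse = best_hits(reverse_text, max_evalue)
    return sorted((q, s) for q, s in forward.items() if reverse.get(s) == q)
```

In this sketch a transcript is retained only if its top reference hit points back to the same transcript in the reverse search; collapsing redundant contigs (CD-HIT) and domain scanning (HMMER) are separate steps that are not reproduced here.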

Multiple sequence alignment and phylogenetic tree construction

Multiple sequence alignment of homeodomain sequences was performed using MAFFT (version 7) (Katoh et al., 2009). A phylogenetic tree was built from the MAFFT alignment using RAxML with the WAG + G model to generate a best-scoring maximum likelihood tree (Stamatakis, 2014). Geneious (version 7) was used to generate a graphical representation of the Newick tree (Kearse et al., 2012).
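As a rough illustration of how such an alignment and tree-building run might be scripted, the Python sketch below assembles MAFFT and RAxML command lines. The exact commands used in the study are not stated, so everything here is an assumption: the `--auto` alignment strategy, the rapid-bootstrap mode (`-f a`), the random seeds, the executable name `raxmlHPC` (which varies between builds), and the helper names `mafft_command`, `raxml_command` and `run_pipeline`. The replicate count of 1000 follows the Figure 2 legend.

```python
import subprocess


def mafft_command(fasta_in):
    """Build a MAFFT command; MAFFT writes the alignment to standard output."""
    return ["mafft", "--auto", fasta_in]


def raxml_command(alignment, run_name, bootstraps=1000, seed=12345):
    """Build a RAxML command for rapid bootstrapping plus a best-scoring ML tree search."""
    return [
        "raxmlHPC",              # executable name is an assumption; it differs between builds
        "-f", "a",               # rapid bootstrap analysis and best-scoring ML tree search
        "-m", "PROTGAMMAWAG",    # WAG substitution matrix with gamma rate heterogeneity
        "-p", str(seed),         # random seed for the parsimony starting trees
        "-x", str(seed),         # random seed for rapid bootstrapping
        "-#", str(bootstraps),   # number of bootstrap replicates
        "-s", alignment,         # input alignment file
        "-n", run_name,          # suffix used for RAxML output files
    ]


def run_pipeline(fasta_in, alignment_out, run_name):
    """Align sequences with MAFFT, then infer a tree with RAxML (both must be on PATH)."""
    with open(alignment_out, "w") as handle:
        subprocess.run(mafft_command(fasta_in), stdout=handle, check=True)
    subprocess.run(raxml_command(alignment_out, run_name), check=True)
```

Keeping command construction separate from execution makes the chosen parameters easy to inspect or log before anything is run.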

Results and discussion

Identification of putative homeobox genes in crustaceans

With the recent availability of a large number of transcriptome data sets, we performed an extensive search for homeobox genes in 120 crustacean species. We focused on species represented across the broader Crustacea, sampling from three main crustacean classes (Malacostraca, Branchiopoda and Copepoda), with emphasis on key food crop species from the order Decapoda (Supplementary Table 1). Using BLAST-based approaches and profile HMMs (Bateman et al., 2004; Finn et al., 2011; Finn et al., 2015) for homology searches, we conservatively identified 4183 transcripts with homeodomain sequences from crustaceans (Figure 1; Dataset 1). Additionally, we included six non-crustacean arthropod species in our search, and from these species we identified 717 homeobox orthologs (Figure 1; Dataset 1).


Figure 1. The homeobox superfamily in Crustacea and representative arthropod species.

(A) The number of homeobox gene orthologs identified in each species is depicted as boxplots, indicating the median and quartiles. Violin plots underlying the boxplots illustrate the sample distribution across different crustacean taxa and the kernel probability density (the width of the shaded areas represents the proportion of data located in these areas). The homeobox gene orthologs from six non-crustacean species within Arthropoda (others) are also shown. (B) Bar charts illustrating the number of homeobox gene orthologs in crustaceans from Decapoda, Branchiopoda and Copepoda, along with six non-crustacean arthropods (others).

Dataset 1. List of Pfam annotated homeobox genes and associated e-values in crustaceans and other arthropods.
Dataset 2. Fasta file for homeobox gene sequences in crustaceans and other arthropods.

Classification and phylogenetic analysis of TALE class genes

Concerted efforts to establish an evolutionary classification of homeobox genes have resulted in 11 recognised classes (Edvardsen et al., 2005; Holland et al., 2007; Ryan et al., 2006; Zhong et al., 2008; Zhong & Holland, 2011). The Three-Amino acid-Loop Extension (TALE) superclass within the group of homeobox genes is characterized by three additional residues between alpha helices 1 and 2 of the homeodomain (Bertolino et al., 1995). TALE class homeodomain proteins are further divided into 6 subclasses (Meis, Pknox, Pbc, Irx, Mkx and Tgif), which are characterized by distinct motifs beyond the homeodomain (Bürglin, 1997; Bürglin, 2005; Holland et al., 2007; Mukherjee & Bürglin, 2007). We classified a total of 165 TALE class orthologs from 15 decapod crustacean species (Figure 2). These genes form distinct phylogenetic groupings, which allows confident assignment of decapod TALE class orthologs to the 6 subclasses (Figure 2). Importantly, the tree topology of crustacean TALE class orthologs recapitulates observations from a previous study (Holland et al., 2007).


Figure 2. Phylogeny of TALE superclass orthologs in decapod crustaceans.

The tree was constructed using the maximum-likelihood method from an amino acid multiple sequence alignment, which includes TALE class genes from other species (Zhong et al., 2008; Zhong & Holland, 2011). TALE orthologs representing the 6 subclasses are colour-coded. The node labels of each taxon are marked with distinctive colours denoted in the figure inset. Bootstrap support values (n=1000) are denoted as branch labels.

Conclusion

We identified 4900 homeodomain transcripts from 120 crustacean and 6 non-crustacean arthropod species. Although this data set is non-exhaustive, because transcriptomes contain only genes expressed at the point of sample collection, it will now serve as a key resource for future functional studies in the context of crustacean aquaculture. Beyond crustaceans, this work is widely applicable to studies on homeobox genes from other animals and will facilitate evolutionary and comparative genomics investigations.

Data availability

Dataset 1: List of Pfam annotated homeobox genes and associated e-values in crustaceans and other arthropods. DOI, 10.5256/f1000research.13636.d190417 (Chang & Lai, 2018).

Dataset 2: Fasta file for homeobox gene sequences in crustaceans and other arthropods. DOI, 10.5256/f1000research.13636.d190418 (Chang & Lai, 2018).

how to cite this article
Chang WH and Lai AG. A TALE of shrimps: Genome-wide survey of homeobox genes in 120 species from diverse crustacean taxa [version 1; peer review: 2 approved, 1 approved with reservations]. F1000Research 2018, 7:71 (https://doi.org/10.12688/f1000research.13636.1)

Open Peer Review

Reviewer Report 23 May 2018
Colleen Doherty, Department of Molecular and Structural Biochemistry, North Carolina State University, Raleigh, NC, USA 
Approved
In this manuscript, Chang and Lai identify sequences of the homeobox genes in crustaceans from transcriptional data. For TALE family members, they classify these orthologs into the six subfamilies. The introduction provides the justification for establishing this resource in these ...
Doherty C. Reviewer Report For: A TALE of shrimps: Genome-wide survey of homeobox genes in 120 species from diverse crustacean taxa [version 1; peer review: 2 approved, 1 approved with reservations]. F1000Research 2018, 7:71 (https://doi.org/10.5256/f1000research.14814.r33514)
Reviewer Report 13 Apr 2018
Nathan J Kenny, Department of Life Sciences, The Natural History Museum of London, Cromwell Road, London, SW7 5BD, UK 
Approved with Reservations
This work makes a putative assessment of the overall homeodomain complements of the transcriptomes of a number of crustacean species. One class of homeodomain containing genes (TALE) from one order of crustaceans (Decapoda) is assessed in detail, but otherwise no ...
Kenny NJ. Reviewer Report For: A TALE of shrimps: Genome-wide survey of homeobox genes in 120 species from diverse crustacean taxa [version 1; peer review: 2 approved, 1 approved with reservations]. F1000Research 2018, 7:71 (https://doi.org/10.5256/f1000research.14814.r32836)
Reviewer Report 14 Feb 2018
Ricardo M. Zayas, Department of Biology, San Diego State University (SDSU), San Diego, CA, USA 
Approved
The report by Chang and Lai provide an extensive dataset and phylogenetic analysis crustacean homeobox genes. This data will be useful to individuals interested in studying the evolution and function of homeobox genes in crustacea and other organisms. Overall, this ...
Zayas RM. Reviewer Report For: A TALE of shrimps: Genome-wide survey of homeobox genes in 120 species from diverse crustacean taxa [version 1; peer review: 2 approved, 1 approved with reservations]. F1000Research 2018, 7:71 (https://doi.org/10.5256/f1000research.14814.r29917)