Genome Note

Genomic and transcriptomic resources for the brown thornbill (Acanthiza pusilla) to support the conservation of a critically endangered subspecies

[version 1; peer review: 1 approved]
PUBLISHED 23 Apr 2024
Abstract

The brown thornbill (Acanthiza pusilla) is a songbird endemic to eastern Australia with five recognised subspecies. The most notable is the King Island brown thornbill (Acanthiza pusilla magnirostris), of which fewer than 100 individuals remain and which, based on expert elicitation, is the most likely Australian bird to become extinct in the next 20 years. We sequenced PacBio HiFi reads of the brown thornbill to generate a high-quality reference genome 1.25 Gb in size with a contig N50 of 20.1 Mb. Additionally, we sequenced mRNA from three tissues to generate a global transcriptome to aid genome annotation. The reference genome for the brown thornbill provides an important resource against which additional genomic data, to be produced in the near future, can be aligned.

Keywords

Genome assembly, reference genome, transcriptome, Aves, mitogenome

Introduction

The brown thornbill (Acanthiza pusilla) is a small species of songbird within the Acanthizidae family endemic to eastern and south-eastern Australia, including Tasmania (Higgins & Peter, 2002). There are five subspecies recognised within the brown thornbill including the Critically Endangered King Island brown thornbill (Acanthiza pusilla magnirostris). This taxon is considered the most likely Australian bird to become extinct within the next 20 years, based on expert elicitation (Geyle et al., 2018). Whilst the nominate brown thornbill is of least conservation concern, there are thought to be fewer than 100 King Island brown thornbills occurring on King Island (area 1098 km²), in the Bass Strait (Bell, Webb, Holdsworth, & Baker, 2023). Whilst surveys are ongoing, the King Island brown thornbill is understood to be restricted to patches of mature eucalypt forest on King Island, where it primarily forages in the canopy and in the crevices of bark on tree trunks (Bell et al., 2023).

The generation of a reference genome and associated transcriptomic data is vital for informing genetic management of the King Island subspecies, and provides a reference against which genetic data produced in the near future can be aligned. The genome is also the first for the genus Acanthiza, contributing to global efforts to sequence life on Earth (Lewin et al., 2022).

To facilitate detailed genomic research on this species, we sequenced DNA with PacBio HiFi long reads to generate a high-quality reference assembly and sequenced RNA from three tissues to provide transcriptomic resources and assist in genome annotation for the brown thornbill.

Methods

Sample collection and DNA/RNA extraction

A single wild male brown thornbill (B974_KIBT) from Tasmania was captured using a mist net and euthanised for genome and transcriptome generation under Australian National University Animal Experimentation Ethics Committee program of wildlife authorisation approval number #A2021/33 (approval date 13/07/2021) and Tasmanian Scientific Permit #TFA23010. Every effort was made to reduce suffering of animals, including (i) collecting the minimum number of animals required for the study (one); (ii) pre-arranging animal euthanasia with a qualified veterinarian; (iii) collection of the animal from as close to the location of the veterinarian as practically possible to minimise transportation time; and (iv) transportation in a soft, dark material ‘bird bag’ to minimise stress during transportation. Tissue samples were dissected and either flash frozen at -80°C or preserved in RNAlater before being frozen at -80°C. High molecular weight (HMW) DNA was then extracted from kidney tissue using the Nanobind Tissue Big DNA Kit v1.0 (Circulomics). A Qubit fluorometer was used to assess the concentration of DNA with the Qubit dsDNA BR assay kit (Thermo Fisher Scientific). Total RNA was extracted from liver, brain and gonads using the RNeasy Tissue Kit (Qiagen) with RNase-free DNase I set (Qiagen). RNA quality was determined using the NanoDrop (Thermo Fisher Scientific) and the RNA integrity number (RIN) was determined using the Bioanalyzer RNA nano 6000 kit (Agilent 2100).

Library construction and sequencing

HMW DNA was sent for Pacific Biosciences High Fidelity (PacBio HiFi) library preparation with the SMRTbell Express Template Prep Kit 2.0 (Pacific Biosciences) and sequencing on one single molecule real-time (SMRT) cell of the PacBio Revio machine at the Australian Genome Research Facility (St Lucia, Australia). Total RNA from the liver, brain and gonads was sequenced as 150 bp paired-end (PE) reads using an Illumina NovaSeq X with Illumina Stranded mRNA library preparation at the Ramaciotti Centre for Genomics (University of New South Wales, Kensington, Australia).

Genome assembly

The genome assembly was conducted on the Galaxy Australia (The Galaxy Community, 2022) public server usegalaxy.org.au (Afgan et al., 2016) running the Genome assembly with ‘hifiasm’ (RRID:SCR_021069) on Galaxy Australia workflow v2.1 (Price & Farquharson, 2022). Briefly, Picard (http://broadinstitute.github.io/picard) (Galaxy version 2.18.2.2; RRID:SCR_006525) SamToFastq, samtools (Danecek et al., 2021; Li et al., 2009) (Galaxy version 2.0.3; RRID:SCR_002105) flagstat and FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc) (Galaxy version 0.72; RRID:SCR_014583) were used to convert BAM files to FASTQ and quality check the reads for input to Hifiasm (Cheng, Concepcion, Feng, Zhang, & Li, 2021; Cheng et al., 2022). Hifiasm (Galaxy version 2.1) was run on Galaxy Australia to assemble the genome. Basic genome assembly statistics were calculated using the stats.sh script in BBMap v37.98 (sourceforge.net/projects/bbmap/) (RRID:SCR_016965). Genome completeness was determined using Benchmarking Universal Single-Copy Orthologues (BUSCO; RRID:SCR_015008) v5.4.6 (Simao, Waterhouse, Ioannidis, Kriventseva, & Zdobnov, 2015) with both the vertebrata_odb10 (n = 3354) and aves_odb10 (n = 8338) lineages on Galaxy Australia. Genome completeness and base accuracy were also determined using Merqury v1.3 (RRID:SCR_022964) (Rhie, Walenz, Koren, & Phillippy, 2020), implemented in the Genome assessment post assembly workflow on Galaxy Australia (Price, 2023). Repetitive elements of the genome were identified, classified and masked on a Pawsey Supercomputing Centre Nimbus cloud machine (256 GB RAM, 64 vCPU, 3 TB storage) by building a database using RepeatModeler v2.0.1 (RRID:SCR_015027) (Flynn et al., 2020); repeats were then masked using RepeatMasker v4.0.9 (RRID:SCR_012954) (Smit, Hubley, & Green, 2013-2015) with the -nolow parameter to avoid masking low complexity repeats.
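
For readers unfamiliar with the contiguity statistics reported by tools such as stats.sh, the following is a minimal sketch of how N50/L50 and N90/L90 are conventionally defined. It is not the BBMap implementation; the function name `nx_lx` and the toy contig lengths are assumptions made for illustration only.

```python
def nx_lx(contig_lengths, fraction=0.5):
    """Return (Nx, Lx) for a list of contig lengths.

    Nx is the length of the contig at which the running total of
    lengths (longest first) reaches `fraction` of the assembly size;
    Lx is the number of contigs needed to reach that point.
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    ordered = sorted(contig_lengths, reverse=True)
    target = sum(ordered) * fraction

    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return length, count
    # With 0 < fraction <= 1 the loop always returns; this guards against misuse.
    raise RuntimeError("target was never reached")


# Hypothetical toy lengths, not taken from the brown thornbill assembly.
toy_lengths = [50, 40, 30, 20, 10]
n50, l50 = nx_lx(toy_lengths, 0.5)
n90, l90 = nx_lx(toy_lengths, 0.9)
print(n50, l50, n90, l90)
```

A larger N50 with a smaller L50 indicates that fewer, longer contigs account for half of the assembly.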

Mitochondrial assembly

The contig representing the mitochondrial genome was identified from the reference genome assembly using MitoHiFi v2 (Allio et al., 2020; Uliano-Silva et al., 2023) and visualised using Proksee (Grant et al., 2023). MitoHiFi identified the yellow thornbill (Acanthiza nana) mitochondrial genome (KY994614.1) as the most closely related publicly available mitochondrial genome, and this was used to search for the brown thornbill mitochondrial genome.

Transcriptome assembly

Transcriptome assembly was performed on the University of Sydney’s High Performance Computer, Artemis. Raw transcriptome reads were quality assessed pre- and post-trimming with FastQC v0.11.8 (RRID:SCR_014583). Trimmomatic v0.39 (RRID:SCR_011848) (Bolger, Lohse, & Usadel, 2014) with the parameters SLIDINGWINDOW:4:5, LEADING:5, TRAILING:5, MINLEN:25 and ILLUMINACLIP:2:30:10 with the TruSeq3-PE adapters was used to quality trim reads. The repeat masked genome was indexed and trimmed reads were aligned using the -dta parameter with hisat2 v2.1.0 (RRID:SCR_015530) (Kim, Paggi, Park, Bennett, & Salzberg, 2019). Resulting SAM files were converted to BAM format and sorted using samtools v1.9 (Danecek et al., 2021; Li et al., 2009). Stringtie v2.1.6 (RRID:SCR_016323) (Pertea et al., 2015) was used to generate a GTF for each transcriptome. Stringtie v2.1.6 with the -merge parameter merged transcripts into a global transcriptome, retaining only transcripts with fragments per kilobase of exon per million mapped fragments (FPKM) > 0.1 and length > 30. CPC2 v2019-11-19 (Kang et al., 2017) was used to predict coding potential and only transcripts predicted to be coding were retained. TransDecoder v2.0.1 (https://github.com/TransDecoder/TransDecoder) (RRID:SCR_017647) was used to predict open reading frames in the global transcriptome with a minimum transcript length of 20. Transcriptome completeness was assessed using BUSCO v5.4.6 (Simao et al., 2015) with the vertebrata_odb10 (n = 3354) and aves_odb10 (n = 8338) lineages on Galaxy Australia.
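
The retention thresholds applied at the merge step (FPKM > 0.1 and length > 30) can be illustrated with a short sketch. This is not how Stringtie implements its filtering; the `Transcript` record, the function names and the toy values are hypothetical and serve only to show the logic of the two thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    transcript_id: str  # hypothetical identifier
    length: int         # transcript length in bases
    fpkm: float         # expression estimate


def passes_thresholds(t, min_fpkm=0.1, min_length=30):
    """Keep a transcript only if it strictly exceeds both thresholds."""
    return t.fpkm > min_fpkm and t.length > min_length


def filter_transcripts(transcripts, min_fpkm=0.1, min_length=30):
    """Return the transcripts that pass both thresholds, in input order."""
    return [t for t in transcripts if passes_thresholds(t, min_fpkm, min_length)]


# Toy records, not taken from the brown thornbill transcriptome.
toy = [
    Transcript("tx1", 1200, 5.3),   # kept
    Transcript("tx2", 25, 8.0),     # dropped: too short
    Transcript("tx3", 900, 0.05),   # dropped: FPKM too low
    Transcript("tx4", 31, 0.11),    # kept: just above both thresholds
]
kept = filter_transcripts(toy)
print([t.transcript_id for t in kept])  # ['tx1', 'tx4']
```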

Genome annotation

Genome annotation was performed using FgenesH++ v7.2.2 (Softberry; RRID:SCR_018928 (Solovyev, Kosarev, Seledsov, & Vorobyev, 2006)) using the longest open reading frame as predicted from the global transcriptome, non-mammalian settings and optimised parameters supplied with the American crow (Corvus brachyrhynchos) gene finding matrix. BUSCO v5.4.6 (Simao et al., 2015) in protein mode was run on Galaxy Australia to assess the completeness of the annotation with the vertebrata_odb10 (n = 3354) and aves_odb10 (n = 8338) lineages. The ‘genestats’ script (https://github.com/darencard/GenomeAnnotation) was used to obtain the average number of exons and introns and the average exon and intron length.

Results

Genome assembly

The hifiasm assembly of the brown thornbill from PacBio HiFi data resulted in a genome 1.25 Gb in size consisting of 1,000 contigs and sequenced to a depth of 43x. The longest contig in the assembly is 97.7 Mb and the assembly has an N50 of 20.1 Mb and L50 of 17 (Table 1). The genome is also highly complete, with 96.9% complete Aves BUSCOs present in the assembly (Table 1). Merqury analysis also indicated a high-quality genome with QV > 59 and 87.1% complete k-mers. The mitochondrial genome is 16,862 bp and contains 37 genes, including 22 tRNAs, 13 protein-coding genes and 2 rRNAs (Figure 1). Repeat masking identified 19.06% of the genome as repeats (Table 2), which is in a similar range to other bird species (Zhang et al., 2014).
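
To put the Merqury QV in context, a consensus quality value is Phred-scaled, so QV = -10 log10(E), where E is the estimated per-base error probability. The sketch below converts between the two; the function names are illustrative assumptions and are not part of Merqury.

```python
import math


def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate):
    """Convert a per-base error probability to a Phred-scaled quality value."""
    if error_rate <= 0:
        raise ValueError("error_rate must be positive")
    return -10 * math.log10(error_rate)


# QV 59 is the lower bound reported for this assembly.
error = qv_to_error_rate(59)
print(f"{error:.3e}")     # about 1.259e-06 errors per base
print(round(1 / error))   # bases per expected error, roughly 794,000
```

Under this scale, QV > 59 corresponds to fewer than roughly one expected base error per 790,000 bases.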

Table 1. Genome assembly statistics of the brown thornbill (Acanthiza pusilla).

| Metric | Value |
| --- | --- |
| Assembly size (Gb) | 1.25 |
| Number of contigs | 1000 |
| Contig N50 (Mb) | 20.13 |
| Contig N90 (Mb) | 9.19 |
| Contig L50 | 17 |
| Contig L90 | 114 |
| Longest contig (Mb) | 97.74 |
| GC content (%) | 43.67 |
| Complete vertebrata_odb10 BUSCOs | 96.7% [Single Copy: 95.3%, Duplicated: 1.4%] |
| Fragmented vertebrata_odb10 BUSCOs | 1.4% |
| Missing vertebrata_odb10 BUSCOs | 1.9% |
| Complete aves_odb10 BUSCOs | 96.9% [Single Copy: 96.3%, Duplicated: 0.6%] |
| Fragmented aves_odb10 BUSCOs | 0.5% |
| Missing aves_odb10 BUSCOs | 2.6% |
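
The BUSCO categories in Table 1 are related arithmetically: complete is the sum of single-copy and duplicated, and complete, fragmented and missing together account for all orthologues searched. The following sketch checks these relationships for the Table 1 values; the function name and the rounding tolerance are assumptions made for this illustration.

```python
def busco_is_consistent(single, duplicated, complete, fragmented, missing, tol=0.15):
    """Check that BUSCO percentages add up, allowing for rounding to 0.1%."""
    complete_ok = abs((single + duplicated) - complete) <= tol
    total_ok = abs((complete + fragmented + missing) - 100.0) <= tol
    return complete_ok and total_ok


# Genome assembly values from Table 1.
vertebrata = dict(single=95.3, duplicated=1.4, complete=96.7, fragmented=1.4, missing=1.9)
aves = dict(single=96.3, duplicated=0.6, complete=96.9, fragmented=0.5, missing=2.6)

print(busco_is_consistent(**vertebrata))  # True
print(busco_is_consistent(**aves))        # True
```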

Figure 1. Mitochondrial genome of the brown thornbill (Acanthiza pusilla). Purple sections indicate tRNAs and blue arrows indicate the direction of the genes in the mitogenome.

Table 2. Classification of repeat elements of the brown thornbill (Acanthiza pusilla) genome assembly.

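| Repeat element | Number of elements | Percentage of sequence |
| --- | --- | --- |
| SINEs | 2804 | 0.03 |
| MIRs | 1750 | 0.02 |
| LINEs | 139541 | 3.56 |
| LINE1 | 725 | 0.01 |
| L3/CR1 | 138625 | 3.54 |
| LTR elements | 55108 | 3.78 |
| ERVL | 24480 | 1.41 |
| ERV Class I | 18549 | 1.51 |
| ERV Class II | 9977 | 0.8 |
| DNA elements | 12887 | 0.35 |
| hAT-Charlie | 227 | 0 |
| Unclassified | 118431 | 11.34 |
| Total interspersed repeats | | 19.06 |
| Small RNA | 490 | 0 |
| Satellites | 7620 | 0.28 |
| Simple repeats | 55 | 0 |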
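
When reading Table 2, it may help to note that the reported total for interspersed repeats equals the sum of the top-level classes only. The sketch below assumes the usual RepeatMasker summary layout, in which MIRs, LINE1, L3/CR1, the ERV rows and hAT-Charlie are subclasses already counted within their parent class; that nesting is an assumption here, as are the variable and function names.

```python
import math

# Top-level interspersed repeat classes from Table 2 (percentage of sequence).
top_level_percent = {
    "SINEs": 0.03,
    "LINEs": 3.56,
    "LTR elements": 3.78,
    "DNA elements": 0.35,
    "Unclassified": 11.34,
}
reported_total = 19.06  # "Total interspersed repeats" in Table 2


def classes_match_total(class_percent, total, tol=0.05):
    """Check that the class percentages sum to the reported total within rounding."""
    return math.isclose(sum(class_percent.values()), total, abs_tol=tol)


print(classes_match_total(top_level_percent, reported_total))  # True
```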

Transcriptome assembly and genome annotation

All individual tissues had alignment rates greater than 85% against the repeat masked reference genome (liver: 93.52%, brain: 91.75% and gonads: 89.24%). A total of 45,082 transcripts were predicted to have coding potential and 12,549 longest open reading frame transcripts were used as input for genome annotation with FgenesH++. A total of 29,706 genes were predicted by the FgenesH++ annotation software, with the annotation containing 73.9% complete aves_odb10 BUSCOs (Table 3). There was an average of 7.42 exons and 6.42 introns per gene (Table 3).

Table 3. Statistics of the global transcriptome and annotation of the brown thornbill (Acanthiza pusilla).

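| Dataset | Metric | Value |
| --- | --- | --- |
| Global transcriptome | Complete vertebrata_odb10 BUSCOs | 94.1% [Single Copy: 42.5%, Duplicated: 51.6%] |
| Global transcriptome | Fragmented vertebrata_odb10 BUSCOs | 1.6% |
| Global transcriptome | Missing vertebrata_odb10 BUSCOs | 4.3% |
| Global transcriptome | Complete aves_odb10 BUSCOs | 90.6% [Single Copy: 39.5%, Duplicated: 51.1%] |
| Global transcriptome | Fragmented aves_odb10 BUSCOs | 1.2% |
| Global transcriptome | Missing aves_odb10 BUSCOs | 8.2% |
| Annotation | Complete vertebrata_odb10 BUSCOs | 69.3% [Single Copy: 67.1%, Duplicated: 1.6%] |
| Annotation | Fragmented vertebrata_odb10 BUSCOs | 10.5% |
| Annotation | Missing vertebrata_odb10 BUSCOs | 20.2% |
| Annotation | Complete aves_odb10 BUSCOs | 73.9% [Single Copy: 73.3%, Duplicated: 0.6%] |
| Annotation | Fragmented aves_odb10 BUSCOs | 5.7% |
| Annotation | Missing aves_odb10 BUSCOs | 20.4% |
| Annotation | Average number of exons per gene | 7.42 |
| Annotation | Average number of introns per gene | 6.42 |
| Annotation | Average exon length (bp) | 2246.65 |
| Annotation | Average intron length (bp) | 20551.18 |
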
Ethical considerations

Birds were sampled under Australian National University Animal Experimentation Ethics Committee program of wildlife authorisation approval number #A2021/33 (approval date 13/07/2021) and Tasmanian Scientific Permit #TFA23010.

Author contribution

Luke W. Silver

Roles: Data Curation, Formal Analysis, Investigation, Software, Methodology, Writing – Original Draft Preparation

Ross Crates

Roles: Conceptualization, Funding Acquisition, Data Collection, Administration, Writing – Original Draft Preparation

Dejan Stojanovic

Roles: Data collection, Writing – Review & Editing

Catherine M. Young

Roles: Data collection, Writing – Review & Editing

Katherine Belov

Roles: Conceptualization, Funding Acquisition, Supervision, Writing – Review & Editing

Katherine A. Farquharson

Roles: Methodology, Supervision, Writing – Review & Editing

Rob Heinsohn

Roles: Administration, Supervision, Writing – Review & Editing

Carolyn J. Hogg

Roles: Conceptualization, Funding Acquisition, Project Administration, Supervision, Writing – Review & Editing

How to cite this article
Silver LW, Crates R, Stojanovic D et al. Genomic and transcriptomic resources for the brown thornbill (Acanthiza pusilla) to support the conservation of a critically endangered subspecies [version 1; peer review: 1 approved]. F1000Research 2024, 13:337 (https://doi.org/10.12688/f1000research.145788.1)
NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Reviewer Report 01 Aug 2024
Natalie Forsdick, Manaaki Whenua - Landcare Research, Auckland, New Zealand 
Approved
The Genome Note 'Genomic and transcriptomic resources for the brown thornbill (Acanthiza pusilla) to support the conservation of a critically endangered subspecies' provides a clear and brief account of the genome assembly and annotation of this species, which will be used ...
HOW TO CITE THIS REPORT
Forsdick N. Reviewer Report For: Genomic and transcriptomic resources for the brown thornbill (Acanthiza pusilla) to support the conservation of a critically endangered subspecies [version 1; peer review: 1 approved]. F1000Research 2024, 13:337 (https://doi.org/10.5256/f1000research.159784.r297997)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
