Phylogeny and biogeography of the carnivorous plant family Droseraceae with representative Drosera species from Northeast India [version 1; peer review: 1 approved, 1 not approved]

Background: Botanical carnivory is spread across four major angiosperm lineages and five orders: Poales, Caryophyllales, Oxalidales, Ericales and Lamiales. The carnivorous plant family Droseraceae is well known for its wide range of representatives in the temperate zone. Taxonomically, it is regarded as one of the most problematic and unresolved carnivorous plant families. In the present study, the phylogenetic position and biogeography of the genus Drosera are revisited using two Drosera species (D. burmannii and D. peltata) found in Meghalaya (Northeast India). Methods: The purposes of this study were to investigate the monophyly of the genus Drosera, reconstruct its phylogenetic relationships and ancestral area, and infer its origin and dispersal using molecular markers from the whole ITS (18S, 28S, ITS1, ITS2) region and ribulose bisphosphate carboxylase (rbcL) sequences. Results: The present study recovered most of the findings of previous studies. The basal position of Droseraceae within the non-carnivorous Caryophyllales indicated in the tree topologies, together with fossil findings, strongly supports a date of origin for Droseraceae during the Paleocene (55-65 MYA). Within the family Droseraceae, the sister relationship between Aldrovanda and Dionaea is supported by our ITS and rbcL dataset. This information can be used for further comparative and experimental studies. Conclusions: Drosera species are best suited as model systems for addressing a wide array of questions concerning evolutionary dynamics and ecological processes governing botanical carnivory.


Introduction
The carnivorous plant family Droseraceae is well known for its complex taxonomic diversity in temperate climatic regions. The family comprises nearly 200 species in two monotypic genera, Aldrovanda and Dionaea, and one large genus, Drosera (popularly known as sundews), which contains most of the species 1,2,3 . The name Drosera is derived from the Greek word meaning 'dewdrops'. These plants usually exhibit remarkable tolerance to high-stress habitats and have acquired adequate reproductive fitness on the evolutionary ladder for their survival 4 . The specialized carnivorous traps common to all Drosera species are in fact highly modified leaves lined with mucilaginous glandular trichomes or tentacles. Drosera species mostly inhabit regions of the Southern hemisphere and Southwestern Australia. In India, Drosera species are found in some parts of the Northeastern region, the Deccan peninsular region, Southern India and parts of West Bengal 5,6 . Of the three known Drosera species (D. burmannii Vahl., D. indica L. and D. peltata Thunb.) reported in India, two are found in Meghalaya, i.e., D. burmannii and D. peltata 7 .
Drosera species can be grouped into five habits according to growth form: temperate sundews, pygmy sundews, subtropical sundews, tuberous sundews and the petiolaris complex. The diversity of growth forms in this genus is so vast that it comprises annual species as well as species forming hibernacula during winter dormancy or underground tubers during extremely dry summers. The long tentacles on the leaves are often brightly coloured and tipped with nectar-secreting glands, adhesive compounds, and digestive enzymes. Upon prey capture, the tentacles bend inward to bring as many secretory glands as possible into contact with the prey. According to Darwin 8 , glandular formations present in Drosera leaves secrete proteolytic enzymes similar to those found in the animal stomach. He also demonstrated that the substances solubilized and decomposed by the action of these enzymes are absorbed by the plant foliage. In some species (for example D. burmannii), the tentacle motion is quite remarkable, as the glands can bend 180° in just fractions of a second.
Many Drosera species are best known for their valuable natural products. Secondary metabolites from sundews, such as 1,4-naphthoquinones and flavonoids, have significantly contributed to folklore medicinal practices worldwide 9 . There are reports in ancient literature describing medicinal usage of different species of Drosera in treating epilepsy 10 . Many species of Drosera are threatened in India due to their confined distribution and extensive usage in the herbal industry, and thus have been categorized as vulnerable by the International Union for Conservation of Nature 6,11,12 . Candolle proposed the first infrageneric classification of Drosera, with two recognized sections of the plant based upon the characteristics and morphology of their styles 13 . Later Seine and Barthlott 14 described three subgenera and 11 sections based on morphological, anatomical, palynological and cytotaxonomical studies. The phylogenetic study of Williams et al. 15 , based on ribulose bisphosphate carboxylase (rbcL) sequences and morphological data, could identify three major lineages within Drosera with subgenus regia emerging as the first branch, followed by subgenus capensis. Rivadavia et al. 16 made an attempt to understand Drosera systematics based on the rbcL and 18S regions. The study highlighted D. regia and D. arcturi to be basal species for Drosera.
The genus Drosera is distributed across both hemispheres, with ~80 species in Australia, ~30 species in Africa (including North Africa and South Africa), ~30 species in South America, and fewer than 10 species in North America and Eurasia 17 . Phylogeography is not merely an extension of phylogenetic principles to the intraspecific level; rather, it describes population structure by using the information embedded in the geographical patterns of ancestral lineages across the range of a species 18 . Understanding the process of colonization and population divergence of these species is fundamental to the study of their evolutionary diversification. Previous studies based on rbcL markers 16 showed that the South American Drosera species arose from Australian species by dispersal, and that the African species other than D. regia and D. indica arose subsequently from ancestors in South America. A multidisciplinary study by Rivadavia et al. 19 of D. meristocaulis, which occurs in the Neblina highlands of northern South America, proposed long-distance dispersal from Australia to South America. It was also found that section Bryastrum diversified from its ancestor about 13-12 MYA, which does not agree with a Gondwanan origin for D. meristocaulis 19,20 . Rivadavia et al. 16 argued for a South African or Australian origin of Drosera. Though the outcomes of their analysis could be attributed to Croizat and Gondwanan vicariance, this origin of Drosera is not supported by the recent studies on Droseraceae and their evolution 16 . This implies that more work needs to be done to fully understand the evolution of the family Droseraceae and of the genus Drosera in particular.
In the present study, the phylogenetic position and biogeography of the genus Drosera are revisited using the two Drosera species (D. burmannii and D. peltata) found in Meghalaya (Northeast India). The purposes of this study were (1) to investigate the monophyly of the genus Drosera and reconstruct its phylogenetic relationships and ancestral area within the family Droseraceae, (2) to infer the origin and dispersal of Drosera, and (3) to infer the phylogenetic relationships among Aldrovanda, Dionaea, and Drosera, using molecular markers from the whole ITS (18S, 28S, ITS1, ITS2) region and rbcL sequences.

Methods
Survey, collection and taxon sampling
Insectivorous plant species in the genus Drosera were collected from different regions of Meghalaya, according to their availability at the time. The collected plants comprised two species of Drosera, viz. D. peltata and D. burmannii (Figure 1 and Figure 2). Drosera burmannii Vahl. was collected from Jarain, Jaintia Hills District, Meghalaya (N 25°36′, E 92°15′) and Drosera peltata Sm. was collected from Cherrapunjee, East Khasi Hills District, Meghalaya (N 25°07′, E 91°28′). Identification of these insectivorous plants was carried out at the Botanical Survey of India (BSI), Eastern Circle, Shillong, Meghalaya. Herbarium specimens were prepared and deposited at BSI and the Department of Botany, North-Eastern Hill University (NEHU), Shillong. The specimen voucher numbers (NEHU) and accession numbers (BSI) are 11924 and 86843 for Drosera burmannii, and 11962 and 86840 for Drosera peltata, respectively. We amplified the whole ITS and rbcL regions from both of the abovementioned plants for the present work. In addition, we collected GenBank data that included these markers from representative species belonging to the genera Nepenthes, Drosera, Aldrovanda, Dionaea and Sarracenia, along with their geographical distribution information (Table 1).

DNA extraction, PCR amplification and sequencing
For Drosera sp., the leaves and the stems were taken, washed thoroughly with water to remove dirt and insect debris, kept in 70% alcohol for a few minutes, dried, wrapped in aluminum foil and stored in liquid nitrogen until further use. Total genomic DNA isolation from Drosera was carried out using the DNeasy Plant Mini Kit (Qiagen, USA), according to the manufacturer's instructions with minor modifications (combination of a borate extraction buffer with the DNA extraction kit, and a proteinase K

Phylogenetic analysis
Maximum likelihood. All ITS and rbcL sequences were first aligned using MUSCLE 21 and subsequently concatenated using MESQUITE V3.03 22 . Highly variable sequence regions were excluded from analyses of the extended data set. Because initial separate calculations using noncoding spacer regions and coding matK sequences yielded congruent but incompletely resolved topologies, both partitions were combined in all subsequent analyses. Maximum likelihood (ML) analyses were carried out using MEGA 7 23 . To find the best substitution model for our analyses, ML fits of 24 different nucleotide substitution models were performed. Models with the lowest BIC (Bayesian Information Criterion) scores are considered to describe the substitution pattern best. For each model, the AICc value (Akaike Information Criterion, corrected), the maximum likelihood value (lnL), and the number of parameters (including branch lengths) were also computed. The following models were evaluated for this study: General Time Reversible (GTR); Hasegawa-Kishino-Yano (HKY); Tamura-Nei (TN93); Tamura 3-parameter (T92); Kimura 2-parameter (K2); Jukes-Cantor (JC).
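The model ranking described above can be illustrated with a minimal sketch. It assumes the standard definitions BIC = k ln(n) - 2 lnL and AICc = 2k - 2 lnL + 2k(k+1)/(n-k-1), with n taken as the number of alignment sites; how MEGA 7 defines the sample size internally is not stated in the text. The log-likelihoods, parameter counts and site count below are hypothetical placeholders, not values from this study.

```python
import math

def information_criteria(lnL, k, n):
    """Return (BIC, AICc) for a model with log-likelihood lnL,
    k free parameters (including branch lengths) and n sites."""
    bic = k * math.log(n) - 2.0 * lnL
    aic = 2.0 * k - 2.0 * lnL
    aicc = aic + (2.0 * k * (k + 1)) / (n - k - 1)
    return bic, aicc

# Hypothetical (lnL, number of parameters) per model; NOT results from the study.
candidates = {
    "JC": (-5250.0, 101),
    "K2": (-5190.0, 102),
    "TN93": (-5150.0, 106),
    "GTR": (-5148.0, 109),
}
n_sites = 1200  # hypothetical alignment length

scores = {name: information_criteria(lnL, k, n_sites)
          for name, (lnL, k) in candidates.items()}

# The model with the lowest BIC is treated as the best-fitting one.
best_model = min(scores, key=lambda name: scores[name][0])

for name, (bic, aicc) in sorted(scores.items(), key=lambda item: item[1][0]):
    print(f"{name:5s} BIC={bic:10.2f} AICc={aicc:10.2f}")
print("Lowest BIC:", best_model)
```

With these placeholder numbers the extra parameters of GTR are not justified by its small gain in likelihood, which is the kind of trade-off the BIC penalty is designed to capture.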
For the ITS and rbcL datasets, the evolutionary history was inferred using the ML method based on the Tamura-Nei model 24 . Sequence information for the aligned dataset pertaining to total number of sites (excluding sites with gaps/missing data), sites with alignment gaps or missing data, invariable (monomorphic) sites, G+C content, parsimony informative sites, number of haplotypes (h), haplotype gene diversity (Hd), Nucleotide diversity per site (Pi),  31 . Based on these studies and fossil data on Droseraceae 31 , the following constraints were applied with a normal prior distribution that spanned the full range of nodal age estimates: the most recent common ancestor (MRCA) of Droseraceae (divergence between Drosera and Aldrovanda species) was assigned minimum and maximum divergence times of 38 and 55 MYA, respectively; the MRCA of Drosera species was set to 22-5 MYA.
For biogeographic inference, Bayesian Binary MCMC (BBM) and Statistical Dispersal-Vicariance Analysis (S-DIVA) methods were employed, in which biogeographic reconstructions were averaged over a sample of highly probable Bayesian trees 32 . In S-DIVA, the occurrence of an ancestral range at a node was computed using all alternative reconstruction frequencies generated by the DIVA algorithm for each tree in the data set. To account for uncertainty in both the phylogeny and the ancestral states, S-DIVA was applied to an entire posterior distribution of trees. The geographic areas of endemism assigned to the carnivorous plants in this study, consistent with the present distribution of both outgroup and in-group taxa, are outlined in Table 1.
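The haplotype and nucleotide diversity statistics named in this subsection (h, Hd, Pi) follow standard population-genetic definitions, which the sketch below illustrates. It assumes gap-free aligned sequences of equal length, Hd = n/(n-1) (1 - Σ p_i²) and Pi as the mean number of pairwise differences per site. The four short sequences are invented toy data, and the software actually used for these statistics is not named in the surviving text.

```python
from collections import Counter
from itertools import combinations

def haplotype_diversity(seqs):
    """Return (h, Hd): the number of distinct haplotypes and the
    haplotype (gene) diversity with the n/(n-1) sample correction."""
    n = len(seqs)
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return len(counts), n / (n - 1) * (1.0 - sum_sq)

def nucleotide_diversity(seqs):
    """Return Pi: the average number of pairwise differences per site."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total_diffs / (len(pairs) * length)

# Toy alignment (hypothetical, gap-free, equal length); not data from the study.
toy_alignment = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACGAACGA"]

h, hd = haplotype_diversity(toy_alignment)
pi = nucleotide_diversity(toy_alignment)
print(f"h={h} Hd={hd:.4f} Pi={pi:.4f}")
```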
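The averaging step in S-DIVA can be sketched as follows. This is an illustrative simplification, not the RASP implementation: it assumes that every sampled tree carries equal weight, that each of the N alternative ranges reconstructed at a node in one tree is weighted 1/N, and that trees in which the node is absent are skipped. The area codes and the per-tree reconstructions are hypothetical.

```python
from collections import defaultdict

def sdiva_node_frequencies(per_tree_alternatives):
    """Average ancestral-range frequencies for one node over a tree sample.

    per_tree_alternatives: one list per sampled tree holding the alternative
    ancestral ranges reconstructed for the node in that tree (an empty list
    means the node does not occur in that tree).
    """
    totals = defaultdict(float)
    trees_with_node = 0
    for alternatives in per_tree_alternatives:
        if not alternatives:
            continue  # node absent from this tree
        trees_with_node += 1
        weight = 1.0 / len(alternatives)  # equal weight per alternative range
        for area in alternatives:
            totals[area] += weight
    if trees_with_node == 0:
        return {}
    return {area: w / trees_with_node for area, w in totals.items()}

# Hypothetical reconstructions for a single node across four sampled trees.
# "A" and "B" are placeholder area codes; "AB" is a range spanning both.
sample = [["A"], ["A", "AB"], ["A", "B"], []]
frequencies = sdiva_node_frequencies(sample)
for area, freq in sorted(frequencies.items()):
    print(area, round(freq, 4))
```

The resulting frequencies sum to 1 for the node, so they can be read as the relative support for each candidate ancestral range across the tree sample.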

Results
Phylogenetic analysis
The rbcL gene and ITS regions were aligned separately, and ML and parsimony trees were used for the phylogenetic analyses. The nucleotide sequences were aligned without any insertions or deletions. A total of 52 accessions from the Droseraceae family, together with N. khasiana (Nepenthaceae) and S. flava (Sarraceniaceae) (used as out-groups), were considered for revisiting the Drosera phylogeny (Table 1). A separate concatenated dataset of ITS and rbcL was used for Bayesian phylogeny reconstruction. The consensus core secondary structures of the ITS regions for Drosera species, arranged by geographical distribution, are shown in Figure 3. The ML tree was further used for divergence time studies.
Major clades within Drosera determined via phylogenetic analysis were subjected to relative rate tests using ML estimates of substitutions per site between taxon groups according to the model of best fit (GTR + Γ + I), determined with Modeltest in MEGA 7 23 . Relative rates were intended to provide evidence of whether certain longer branches within Drosera were the result of rate acceleration of individual species. Models with the lowest BIC scores were considered suitable for the analysis. For each model, AICc value, ML value and the number of parameters (including branch lengths) were also presented ( Table 4). Evolutionary rates among sites and their non-uniformity were modeled by using a discrete Gamma distribution (+G) with 5 rate categories. Estimates of gamma shape parameter and/or the estimated fraction of invariant sites are shown (Table 4). For each model assumed or estimated values of transition/transversion bias (R) are shown and are followed by nucleotide frequencies (f) and rates of base substitutions (r) for each nucleotide pair. Sum of r-values is made equal to 1 for each model.

TaxonGap analysis
A DNA barcode marker is judged by its resolving power to discriminate species at generic and infrageneric levels. The intra- and inter-specific sequence divergence amongst the candidate markers chosen for the present study showed a comparative pictorial barcode gap in the form of taxon-plots for the marker candidates (ITS, matK, rbcL) for species representing the family Droseraceae. The results are summarized in Figure 6. For each species, sequence similarity of the same gene within the same species was high; therefore, the relevant intra-specific variation (shown as dark grey bars) was low. TaxonGap plots have the discriminatory power to gauge better marker barcodes when phylogenetic trees for multiple genes need to be compared. Moreover, TaxonGap uses the same scaling for depicting distance values based on individual biomarkers, thus making it straightforward to evaluate multiple genes without the need to compare separate gene trees drawn for each of the taxonomic units. In the present study, it emerged that a combination of different markers (rbcL+ITS+matK) would provide better discriminatory power for identifying species in the carnivorous plant diversity.

Figure caption (TaxonGap plot): The grey and black bars represent the intra- and inter-specific variations, respectively. The thin, black lines denote the smallest inter-specific variation. Names appearing next to the dark bars denote the closest species to that listed on the left.
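The quantities plotted in a TaxonGap-style figure can be approximated with a short sketch: for each species, the largest within-species distance, the smallest distance to any other species, and the name of that closest species. The sketch assumes uncorrected p-distances on gap-free aligned sequences of equal length. The species labels and sequences are hypothetical placeholders, and the function is an illustration of the idea rather than the TaxonGap program itself.

```python
from itertools import combinations

def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def barcode_gap(records):
    """records: list of (species, aligned_sequence) with at least two species.
    Returns {species: (max_intra, min_inter, closest_species)}."""
    summary = {}
    for sp in sorted({name for name, _ in records}):
        own = [seq for name, seq in records if name == sp]
        others = [(name, seq) for name, seq in records if name != sp]
        # Largest distance among sequences of the same species (0 if only one).
        max_intra = max((p_distance(a, b) for a, b in combinations(own, 2)),
                        default=0.0)
        # Smallest distance to any sequence of another species; ties are
        # broken alphabetically by species name.
        min_inter, closest = min((p_distance(a, seq), name)
                                 for a in own for name, seq in others)
        summary[sp] = (max_intra, min_inter, closest)
    return summary

# Hypothetical toy records; labels and sequences are placeholders.
toy_records = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "ACGTTCGAAC"),
    ("species_B", "ACGTTCGAAC"),
    ("species_C", "TCGTTCGAGC"),
]

for sp, (intra, inter, closest) in barcode_gap(toy_records).items():
    gap = "gap" if inter > intra else "no gap"
    print(f"{sp}: max intra={intra:.2f}, min inter={inter:.2f} ({closest}), {gap}")
```

A marker discriminates a species well when its smallest inter-specific distance exceeds its largest intra-specific distance, which is the "barcode gap" the taxon-plots make visible.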

Molecular divergence time estimates
Several genera of Droseraceae have a fossil pollen record. A single record, Fischeripollis, from the European Mid Miocene has been assigned to Dionaea 33 . Fossil seeds and even leaves have contributed to the understanding of the origin of Drosera on the geological timescale. Drosera pollen has been recorded since the Lower Miocene in New Zealand 34 . Several findings of Tertiary pollen from the Mid Miocene of Europe have been assigned to either Drosera (Droserapollis) or Nepenthes (Droseridites) 35 . The molecular calibration, informed by previous studies, is congruent with the fossil pollen record of Droseraceae, thus attesting to a wide distribution of the progenitors of Aldrovanda in the family Droseraceae since the Late Cretaceous (Figure 7).

Ancestral area reconstruction
The RASP tree (Figure 8) indicated that Dionaea and Aldrovanda have their phylogenetic roots in the northern hemisphere, while Drosera most probably had an Australasian origin. Apparently all palaeoendemics (D. meristocaulis, D. burmannii, D. arcturi) are scattered throughout the southern hemisphere and also in tropical America. In this respect, the existing fossil record, i.e. the European Miocene fossils, is noteworthy. A very old age (Cretaceous) can therefore be hypothesized for the whole family, dating back to stages of tectonic development when South America, Africa, and Australia were in closer proximity than they are today. From recent studies, it emerges that Australia is perhaps the secondary center of diversity of the genus Drosera, and most Drosera descendants can be assumed to have originated there.

Discussion
The family Droseraceae was recovered as monophyletic, with representative species from Drosera, Dionaea, and Aldrovanda (Figure 3 and Figure 4), confirming that these highly diverse plants merit further investigation with a larger number of markers from different genomic regions. Although this study targeted some popular markers from the nuclear and extrachromosomal regions, similar markers, other than rbcL, could not be found in public repositories for other species in the family Droseraceae to substantiate our findings. Carnivorous plants are excellent evolutionary models, yet despite their dramatic evolutionary journey, which is no less interesting than a Gothic novel, botanical carnivory remains a severely understudied area. In this study, the family Droseraceae was revisited with present-day DNA marker tools from the chloroplast and nuclear regions in an attempt to comprehend some of the bewildering scientific stories these meat-loving plants have to offer. Phylogenetic trees based on the concatenated rbcL and rDNA ITS markers exhibited 100% bootstrap support in most of the clades. The ML tree for the combined data set also showed that Dionaea and Aldrovanda form a sister group with 100% bootstrap support (Figure 3). Though the trap of Drosera differs markedly from the snap trap system of Dionaea and Aldrovanda, some structures still bear a strong resemblance at the molecular level, reflecting homology between them. A strong similarity is seen in the cellular architecture of the stalked glands of Drosera and the trigger hairs of Aldrovanda, whose origin can be traced to adhesive glands seen in the Plumbaginaceae and other families that are out-groups to the family Droseraceae. This study hints at a common evolutionary origin of trapping mechanisms in Drosera, Dionaea and Aldrovanda.
All these findings attest to the sister relationship of Dionaea and Aldrovanda, indicating a single evolutionary origin of an elaborate snap trap system in carnivorous plants.
The clade from D. barbigera to D. glanduligera in Figure 3 covers a wide range of species spread across Australasia. Species in this clade are well adapted to dry environments and have tubers and stout roots. Subspecies with each adaptive trait form a different clade. Except for D. pygmaea, the species in section Bryastrum have pentamerous flowers and are endemic to southwestern Australia. D. pygmaea has been placed in a different section owing to its unique distributary features and tetramerous flowers 36,37 . This implies that tetramerous flowers are an autapomorphic character that would have evolved from the pentamerous flowers shared by other pygmy sundews. More work is needed to understand the different sections of Drosera before a systematic revision of these plants can be made.
For the species D. burmannii and D. sessilifolia of section Thelocalyx, plesiomorphic pollen features are quite apparent, with simple cohesion similar to that of Aldrovanda and Dionaea, instead of the cross-wall cohesion observed in other Drosera species, except D. glanduligera. These two species share a common ancestor with 100 bootstrap values in the Bayesian phylogeny (Figure 4). Although the family Droseraceae as a whole was monophyletic, the genus Drosera appeared polyphyletic, with many subclades within the tree. D. uniflora and D. stenopetala formed a sister group, which is also supported by their similar morphological characters. The clades from D. capillaris to D. hamiltonii (Figure 3 and Drosera species grouped into separate clades. Further, TaxonGap analysis suggests that the combined use of ITS and rbcL markers could be used to design barcodes that delineate and discriminate species with high resolution (Figure 6).
Though the phylogenetic reconstruction approach could reveal some clades to be in sync with morphological characters and geographic distribution of Drosera species, it becomes imperative to advance the phylogeny research with a genome to phenome approach by targeting more species and new markers from the genomes of this highly diverse and interesting group of plants.

Biogeographic hypotheses
It is widely accepted that Drosera has colonized both hemispheres 37 and that Australia is the center of diversity of Drosera species, with more than 80 species occurring there 38,39,40 . Over 30 species are distributed in Northern Africa and half of the species are distributed in South Africa. South America also has about 30 species, some of which have migrated into North America. Eurasia and North America harbor nearly 10 species, although some of these species are cosmopolitan. Aldrovanda is widely distributed in both hemispheres, including Australia and Africa, while Dionaea is restricted to North America.
The different phylogenetic trees (Figures 3-5 and 7) corroborate some of the previous hypotheses on the origin and dispersal of Drosera species. Dispersal from Australia to South America could be seen in the clade that includes D. burmannii and D. sessilifolia. D. stenopetala has a disjunct distribution in South America and New Zealand. New Zealand and South American Drosera species have been reported to share close relationships 41 , and there might be some unknown mechanism for long-distance dispersal between these two regions. There are reports of dispersal events in D. burmannii, D. indica, and D. peltata from Australia to Asia via Southeast Asia, without any proper explanation for such events. A large number of Drosera species are spread across the Southern hemisphere compared to the Northern hemisphere, which implies that the species in the Northern hemisphere (D. indica, D. capillaris, D. burmannii, D. anglica, D. brevifolia, D. filiformis, D. peltata and D. rotundifolia) would have expanded their distributions to the Southern Hemisphere. Further analyses with more taxa would be required to confirm this inference.

Conclusions
The combined use of different markers and different computational tools, ideally including the NeighborNet algorithm 42 , offers a different approach to inferring species relationships: a relationship network is drawn, incorporating MP trees onto an ML tree, rather than forcing the data into a single rigid tree structure. The present study corroborates most of the findings of previous studies. The basal position of Droseraceae within the non-carnivorous Caryophyllales indicated in the tree topologies, together with fossil findings, strongly supports a date of origin for Droseraceae during the Paleocene (55-65 MYA). Contrary to this hypothesis, which makes the family more ancient, Rivadavia et al. 16 argue that the Droseraceae are located close to the tip of the angiosperm phylogenetic tree. Within Droseraceae, the sister relationship between Aldrovanda and Dionaea is supported by the combined marker dataset [ITS (18S, ITS1, 5.8S, ITS2, 28S) + rbcL]. Our results should further help comparative and experimental studies using carnivorous taxa under similarly strong selective pressures. Drosera species are thus genuine plant model systems for addressing a wide array of questions concerning the evolutionary and ecological processes governing botanical carnivory.
The remaining sequences from previous studies were downloaded from GenBank at NCBI and are outlined in Table 1.