Plastid genome of Passiflora tripartita var. mollissima (poro-poro) from Huánuco, Peru

Passiflora tripartita var. mollissima, known locally as poro-poro, is an important native fruit used in traditional Peruvian medicine, with agro-industrial and pharmaceutical potential owing to its antioxidant capacity. However, few genetic data are available to date, which limits the exploration of its genetic diversity and the development of new genetic studies for its improvement. We report the poro-poro plastid genome to expand the knowledge of its molecular markers, evolutionary studies, molecular pathways, and conservation genetics. The complete chloroplast (cp) genome is 163,451 bp in length with a typical quadripartite structure, containing a large single-copy region of 85,525 bp and a small single-copy region of 13,518 bp, separated by a pair of inverted repeat (IR) regions of 32,204 bp each, with an overall GC content of 36.87%. This cp genome contains 128 genes (110 unique genes and 18 genes duplicated in each IR region), including 84 protein-coding genes, 36 transfer RNA-coding genes, and eight ribosomal RNA-coding genes; 13 genes contain introns (11 genes with one intron and two genes with two introns). The inverted repeat region boundaries among species were similar in organization, gene order, and content, with a few variations. The phylogenetic tree reconstructed from single-copy orthologous genes by maximum likelihood analysis indicates that poro-poro is most closely related to Passiflora menispermifolia and Passiflora oerstedii. In summary, our study constitutes a valuable resource for studying molecular evolution, phylogenetics, and domestication. It also provides a foundation for conservation genetics research and plant breeding programs. To our knowledge, this is the first report on the plastid genome of Passiflora tripartita var. mollissima from Peru.

Plastomes, now sequenced for over 4000 species (Zhou et al., 2021), are small, present in high copy number, and conserved in sequence, which has enabled a significant understanding of plant molecular evolution, structural variation, and the evolutionary relationships underlying plant diversity (Daniell et al., 2016; Dobrogojski et al., 2020). The plastid genome has a quadripartite structure: a large single-copy (LSC) region of 80-90 kilobase pairs (kb), a small single-copy (SSC) region of 16-27 kb, and two inverted repeats (IRa and IRb) of 20-28 kb, with 110-130 unique genes, including protein-coding, transfer RNA (tRNA), and ribosomal RNA (rRNA) genes (Ozeki et al., 1989; Wang & Lanfear, 2019). In recent years, declining genome sequencing costs have resulted in more than 780 complete plant genomes of different species becoming publicly available (Marks et al., 2021; Sun et al., 2022; Shrestha et al., 2019). However, genomic information on underutilized crops remains scarce (Gioppato et al., 2019), and we have only begun to investigate the genomics of plants of great importance for plant breeding programs. The purpose of this research was to obtain the poro-poro plastid genome, which constitutes a valuable resource for studying the molecular evolution, phylogenetics, and domestication of species with beneficial characteristics for human health. In the present study, we report the first plastid genome sequence submitted for an isolate of Passiflora tripartita var. mollissima, an important native fruit of Peru.

Amendments from Version 2
We updated the circular genome map to ensure a more concise and comprehensible representation.

Phylogenetic analysis
We used 26 complete plastome sequences to infer the phylogenetic relationships among Passiflora species, and Vitis vinifera was used as an outgroup (see the Extended data, Aliaga et al., 2023c). Single-copy orthologous genes were identified using the OrthoFinder version 2.2.6 pipeline (Emms & Kelly, 2019). For each gene family, the nucleotide sequences were aligned using the L-INS-i algorithm in MAFFT v7.453 (Katoh & Standley, 2013). A phylogenetic tree based on maximum likelihood (ML) was constructed using RAxML v8.2.12 (Stamatakis, 2014) with the GTRCAT model. The ML tree was reconstructed and edited using MEGA 11 (Tamura et al., 2021) with 1000 replicates.
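The text does not state how the per-gene alignments were combined before tree inference. One common approach, shown here only as a minimal sketch and not as the authors' documented procedure, is to concatenate the aligned single-copy orthologs into one supermatrix. The function name `concatenate_alignments` and the toy sequences are hypothetical.

```python
from typing import Dict, List


def concatenate_alignments(gene_alignments: List[Dict[str, str]]) -> Dict[str, str]:
    """Join per-gene alignments (taxon -> aligned sequence) into one supermatrix.

    Assumption: each alignment is already aligned (for example by MAFFT), so
    all sequences within one gene share the same length. A taxon missing from
    a gene is padded with gap characters.
    """
    # Collect every taxon seen in any gene, in a stable order.
    taxa = sorted(set().union(*(aln.keys() for aln in gene_alignments)))
    parts: Dict[str, List[str]] = {taxon: [] for taxon in taxa}

    for aln in gene_alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one alignment must share a length")
        width = lengths.pop()
        for taxon in taxa:
            # Pad absent taxa with gaps so all rows stay the same width.
            parts[taxon].append(aln.get(taxon, "-" * width))

    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy example with made-up sequences (not data from this study).
toy = [
    {"A": "ATG-", "B": "ATGC"},
    {"A": "GG", "B": "GA", "C": "TA"},
]
supermatrix = concatenate_alignments(toy)
```

With strictly single-copy orthologs every taxon is present in every gene, so the gap padding is only a safeguard.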

Contraction and expansion of the IR boundary
The IR boundary analysis of four Passiflora species revealed that the structure and sequences of the four junctions between the two inverted repeats (IRa and IRb) and the two single-copy regions (LSC and SSC), namely JLB (junction between LSC and IRb), JSB (junction between SSC and IRb), JSA (junction between SSC and IRa), and JLA (junction between LSC and IRa), were similar in P. tripartita var. mollissima, P. oerstedii (147 073 bp; GenBank accession: NC_038124), P. foetida (162 266 bp; GenBank accession: NC_043825), and P. edulis (151 406 bp; GenBank accession: NC_034285) (Figure 2). The genes rps3, rps19, rpl2, rps15, ycf1, ndhF, ndhH, and psbA were located mainly near the IR/LSC and IR/SSC boundaries of the plastome in these four species. In the species order given above, rps3 is located entirely in the LSC region, at distances of 206 bp, 264 bp, 159 bp, and 206 bp, respectively, from the JLB boundary. For rps19, which is in both IR regions, the distance from the JLB boundary varies from 128 to 210 bp. In P. oerstedii, both copies of the rps2 gene are in the IR region, and the ndhH gene is located in the SSC region.
The rps15 gene crossed the SSC/IRb boundary in P. tripartita var. mollissima, extending 243 bp and 17 bp across it. In P. foetida, rps15 is located 182 bp away from the SSC/IRa boundary; in P. oerstedii and P. edulis, it is located at the end of the SSC region, extending 81 bp and 20 bp, respectively. The ndhF gene is located 234 bp away from the SSC/IRa boundary in P. tripartita var. mollissima, and 29 bp, 14 bp, and 219 bp away from the SSC/IRb boundary in P. oerstedii, P. foetida, and P. edulis, respectively. Furthermore, the ycf1 gene in P. oerstedii, P. foetida, and P. edulis is located 266-481 bp away from the SSC/IRa boundary, whereas in P. tripartita var. mollissima it was not present at JSA.
The infA gene, which codes for translation initiation factor 1, is present in P. tripartita var. mollissima but absent from the P. foetida, P. oerstedii, and P. edulis cp genomes. Furthermore, trnG-UCC and ycf68 are unique genes in P. foetida and P. edulis, respectively. The plastome of P. tripartita var. mollissima contained seven genes (ycf1, ycf2, ycf15, rpl20, rpl22, accD, infA) that were lost or non-functional in P. edulis; in contrast to P. foetida, P. oerstedii, and P. edulis, the trnfM-CAU gene was not found.
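As a worked illustration of the junction terminology, the sketch below derives junction positions for P. tripartita var. mollissima from the region lengths reported in the abstract. It assumes a linearized map in the order LSC, IRb, SSC, IRa with 1-based inclusive coordinates; the helper names and the gene coordinates in the example are hypothetical, chosen only to reproduce the 206 bp gap reported for rps3.

```python
def junction_coordinates(lsc: int, ir: int, ssc: int) -> dict:
    """Return the last base of each region, assuming the order LSC-IRb-SSC-IRa."""
    jlb = lsc          # LSC/IRb junction
    jsb = jlb + ir     # IRb/SSC junction
    jsa = jsb + ssc    # SSC/IRa junction
    jla = jsa + ir     # IRa/LSC junction (equals the genome length)
    return {"JLB": jlb, "JSB": jsb, "JSA": jsa, "JLA": jla}


def gap_to_junction(gene_start: int, gene_end: int, junction: int) -> int:
    """Bases between a gene and a junction; 0 if the gene spans or abuts it."""
    if gene_end <= junction:
        return junction - gene_end
    if gene_start > junction:
        return gene_start - junction - 1
    return 0  # the gene crosses the junction


# Region lengths reported for the poro-poro plastome (bp).
junctions = junction_coordinates(lsc=85_525, ir=32_204, ssc=13_518)

# Hypothetical rps3 coordinates ending 206 bp before JLB.
rps3_gap = gap_to_junction(84_000, 85_319, junctions["JLB"])
```

Under these assumptions JLA coincides with the reported genome length of 163,451 bp, which doubles as a consistency check on the region sizes.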

Phylogenetic reconstruction
To identify the evolutionary position of Passiflora tripartita var. mollissima in the Passifloraceae family, phylogenetic relationships were inferred with the OrthoFinder clustering method, which avoids erroneous rearrangements in phylogenetic tree reconstruction and provides a more reliable evolutionary analysis (Gabaldón, 2005; Zhang et al., 2012). The phylogenetic tree was constructed based on single-copy orthologous genes (Emms & Kelly, 2019) and maximum likelihood analysis with the complete annotated protein sequences of 27 plastid genomes, of which 26 were from Passiflora species. One species, Vitis vinifera, was chosen as the outgroup.
Data are available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0).
in Fig. 1. Upon careful examination, I have observed that certain genes, including rrn23, rrn23-fragment, infA, rpoA, etc., bear double annotations in the figure. I appreciate the authors' efforts in incorporating the suggested changes throughout the manuscript. However, I recommend addressing this specific matter to enhance clarity for readers.
To rectify this, I propose that the extra annotations be removed from the GenBank file before generating the diagram. This correction will streamline the information presented in Fig. 1, ensuring a more concise and comprehensible representation for new readers. I believe implementing this adjustment will further enhance the overall quality of the manuscript.
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Organelle genome sequencing, Transcriptome assembling, Genetic diversity, Barcoding and DNA markers, etc.
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
Answer. It has been revised, thank you.
5. The authors used many programs for annotations. I will suggest authors use GeSeq along with Chloë, ARAGORN, and tRNAscan-SE on GeSeq servers for annotation of transfer RNA.
Answer. It has been revised, thank you.
6. The authors need to give some information on how the current genome is similar to or differs from other reported genomes of Passiflora.
Answer. Thank you for your comment. We have added Figure 2, titled "Comparison of IR/SC boundary regions of four Passiflora species".

Rahul G Shelke
Independent Researcher, Amravati, India
I extend my appreciation to the authors for their commendable efforts in extensively exploring the plastid genome of Passiflora tripartita var. mollissima, demonstrating remarkable depth and breadth in their approach. The manuscript substantially enhances our comprehension of this species, emphasizing its plastid genome, genetic diversity, and evolutionary relationships. This newfound insight carries far-reaching implications, fostering the conservation and sustainable utilization of this species. Consequently, the manuscript's significance is markedly elevated, making a noteworthy contribution to the scientific community's understanding of genetic resource stewardship.
Here is a concise summary of the study's key points:
1. Implications and Novelty: The findings of the study hold significant implications. The genetic insights uncovered have the potential to catalyze further investigations in areas such as molecular evolution, conservation genetics, and the development of plant breeding programs. Notably, the manuscript's distinction as the first documentation of the plastid genome of Passiflora tripartita var. mollissima from Peru underscores its novel contribution to the scientific community.
2. Methodological Rigor: The authors' selection of the Illumina Novaseq 6000 platform for DNA sequencing is well-suited to the research objectives. The subsequent analyses and methodologies applied exhibit robustness and alignment with the research goals.

Major comments:
1. While the genes in Table 2 have been organized into categories, it is apparent that certain instances lack gene information or are unclear in their presentation. This calls for a restructuring of the table to ensure a heightened level of clarity and coherence.
2. In Table 2, the authors have indicated the absence of a duplicated copy of the rrn4.5 gene. It is advised that the authors review the annotation of this gene. Typically, rRNA genes are located within the IR regions and therefore exist in duplicated copies. However, in Figure 1 of the chloroplast genome, the rrn4.5 gene is depicted in a duplicated copy. I recommend that the authors rectify any discrepancies pertaining to the rrn4.5 gene both in the table and within the text.
3. The authors have provided comprehensive insights into the expansion and contraction of the LSC, SSC, and IR regions in the compared species. To further enhance reader comprehension, I recommend that the authors consider incorporating the corresponding IRSCOPE figure within the manuscript. This addition would allow readers to access detailed information more effectively and contribute to a clearer understanding of the discussed concepts.
4. Kindly provide information regarding the genes that are missing or subject to gene loss within the compared species.
5. In the abstract, the authors have included the statement: "In summary, our study provides the basis for developing new molecular markers that constitutes a valuable resource for studying molecular evolution and domestication." However, upon reviewing the manuscript, it is evident that the authors have not conducted any SSR, tandem repeat, or DNA diversity analysis, nor have they proposed any markers. Given this situation, I recommend that the authors consider removing the phrase "developing new molecular markers" from the abstract to accurately reflect the scope of their study.

Minor comments:
1. The inclusion of specific methodology details such as "Total genomic DNA was extracted from fresh leaves (herbarium voucher: USM:MHN331531)" and "The DNA was sequenced using Illumina Novaseq 6000 platform" in the abstract might be better suited for the methodology section rather than the abstract itself. Focusing the abstract on the core findings and implications of the study could enhance its conciseness and clarity.
2. Add the full botanical name of Passiflora tripartita in the phylogeny tree to ensure consistency.
3. Gene names should be in italics.
4. It is recommended to include the coverage of the genome assembly within the main text of the manuscript.

The authors adeptly tackle gaps in our understanding of this species' genetics, comparative genomics, and evolutionary relationships. By thoughtfully engaging with reviewer feedback and subsequently refining their manuscript, the authors stand poised to elevate the impact and significance of their research within the realm of plant genomics and beyond. This process will significantly enhance the accuracy and reliability of gene classification and annotation, thereby elevating the overall quality of the manuscript.
Are the rationale for sequencing the genome and the species significance clearly described? Yes

Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of the sequencing and extraction, software used, and materials provided to allow replication by others?
fact, we checked and updated the duplicated copy of the rrn4.5 gene. Thank you.
3. The authors have provided comprehensive insights into the expansion and contraction of the LSC, SSC, and IR regions in the compared species. To further enhance reader comprehension, I recommend that the authors consider incorporating the corresponding IRSCOPE figure within the manuscript. This addition would allow readers to access detailed information more effectively and contribute to a clearer understanding of the discussed concepts.
Answer: Thank you for your comment. Following your suggestion, we have added Figure 2, titled "Comparison of IR/SC boundary regions of four Passiflora species".
4. Kindly provide information regarding the genes that are missing or subject to gene loss within the compared species.
Answer: Thank you for the reminder. Done.
5. In the abstract, the authors have included the statement: "In summary, our study provides the basis for developing new molecular markers that constitutes a valuable resource for studying molecular evolution and domestication." However, upon reviewing the manuscript, it is evident that the authors have not conducted any SSR, tandem repeat, or DNA diversity analysis, nor have they proposed any markers. Given this situation, I recommend that the authors consider removing the phrase "developing new molecular markers" from the abstract to accurately reflect the scope of their study.
Answer. We agree with the reviewer. The manuscript has been corrected: "In summary, our study constitutes a valuable resource for studying molecular evolution, phylogenetics, and domestication…"

Minor comments:
1. The inclusion of specific methodology details such as "Total genomic DNA was extracted from fresh leaves (herbarium voucher: USM:MHN331531)" and "The DNA was sequenced using Illumina Novaseq 6000 platform" in the abstract might be better suited for the methodology section rather than the abstract itself. Focusing the abstract on the core findings and implications of the study could enhance its conciseness and clarity.
Answer. Yes, we agree with the reviewer. Done.
2. Add the full botanical name of Passiflora tripartita in the phylogeny tree to ensure consistency.
Answer. Done.
3. Gene names should be in italics. It is recommended to include the coverage of the genome assembly within the main text of the manuscript.
Answer. Done.
4. The authors adeptly tackle gaps in our understanding of this species' genetics, comparative genomics, and evolutionary relationships. By thoughtfully engaging with reviewer feedback and subsequently refining their manuscript, the authors stand poised to elevate the impact and significance of their research within the realm of plant genomics and beyond. This process will significantly enhance the accuracy and reliability of gene classification and annotation, thereby elevating the overall quality of the manuscript.
Answer: We thank reviewer 1 for all the constructive comments, which led to a substantial improvement of the manuscript. Thank you; it is much appreciated.
Competing Interests: No competing interests were disclosed.
Recently, some Passiflora plastid genomes such as Passiflora edulis (Cauz-Santos et al., 2017), Passiflora xishuangbannaensis (Hao & Wu, 2021), Passiflora caerulea (Niu et al., 2021), Passiflora serrulata (Mou et al., 2021), Passiflora foetida (Hopley et al., 2021), and Passiflora arbelaezii

Figure 1. Plastid genome of Passiflora tripartita var. mollissima. The thick lines indicate the IR1 and IR2 regions, which separate the large single-copy (LSC) and small single-copy (SSC) regions. Genes marked inside the circle are transcribed clockwise, and genes marked outside the circle are transcribed counterclockwise. Genes are color-coded based on their function, shown at the bottom left. The inner circle indicates the inverted boundaries and guanine and cytosine (GC) content.

Figure 2. Comparison of IR/SC boundary regions of four Passiflora species. Boxes represent the nearby border genes. Gaps between the ends of boundaries and adjacent genes, as well as the sizes of gene segments positioned within a boundary, are depicted. The junction sites of LSC/IRb, IRb/SSC, SSC/IRa, and IRa/LSC are denoted as JLB, JSB, JSA, and JLA, respectively.

Figure 3. Phylogenetic tree of 27 plastid genomes using maximum likelihood analysis based on single-copy orthologous proteins. Bootstrap values on the branches were calculated from 1000 replicates.
tRNA-coding genes: 36
Genes duplicated in IR regions: 18

Table 1. Continued.
1 Poro-poro is the common name of Passiflora tripartita var. mollissima in Peru.

Table 2. Genes present in the plastid genome of P. tripartita var. mollissima.