Analysis of the Arabidopsis organellar rhomboid At1g74140 transcript population uncovered splicing patterns different from its close relative At1g74130 [version 1; peer review: 2 not approved]

Background: Four distinct rhomboid genes appear to function in Arabidopsis plastids, two “active” types from the secretases and presenilin-like associated rhomboid-like (PARL) categories (At1g25290 and At5g25752) and two “inactive” rhomboid forms (At1g74130 and At1g74140).  The number of working rhomboids is further increased by alternative splicing, two reported for At1g25290 and three for At1g74130.  Since At1g25290 and At1g74130 exist as alternative splice variants, it would be necessary to assess the splicing patterns of the other two plastid rhomboid genes, At5g25752 and At1g74140, before studying the Arabidopsis plastid rhomboid system as a whole. Methods: This study thus specifically focused on an analysis of the At1g74140 transcript population using various RT-PCR strategies. Results: The exon mapping results indicate splicing patterns different from the close relative At1g74130, despite similarity between the exonic sequences.  The splicing patterns indicate a high level of sequence “discontinuity” in the At1g74140 transcript population with a significant portion of the discontinuity being generated by two regions of the gene. Conclusion: The overall discontinuous splicing pattern of At1g74140 may be reflective of its mode of involvement in activities like controlling gene expression.


Introduction
The rhomboid protein gene system in Arabidopsis is complex, with 13 transcribed genes encoding both "active" and "inactive" rhomboid protein types (Knopf & Adam, 2012; Koonin et al., 2003; Tripathi & Sowdhamini, 2006). A recent survey of the databases indicates at least 22 entries for Arabidopsis (Powles & Ko, 2018). These entries encompass the same range of rhomboid types as that observed in other organisms (Lemberg & Freeman, 2007). One intriguing aspect of these different rhomboids is that they appear to operate simultaneously, for example in the Arabidopsis plastid compartment. In Arabidopsis, four distinct plastid rhomboid genes are predicted: two encoding "active" types (At1g25290 and At5g25752) and two encoding "inactive" forms (At1g74130 and At1g74140). This number is further increased by alternative splicing, as reported for At1g25290 and At1g74130 (Powles et al., 2013; Sedivy-Haley et al., 2012). Alternative splicing of At1g25290 transcripts introduced variant forms with and without a putative cyclin-binding RVL motif (Sedivy-Haley et al., 2012). For At1g74130, alternative splicing generated three variants with different carboxyl segments, changing from a predicted seven to a six transmembrane structure (Powles et al., 2013). These three At1g74130 variants exhibited different functionalities, such as their interactions with the Tic40 substrate and the yeast mitochondrial rhomboid protease Rbd1 and Mgm1 in live cell contexts, and their abilities in sensitizing cells to antimicrobial drugs (Powles et al., 2013; Powles & Ko, 2018). These different functionalities are especially interesting since one potential role of At1g74130 appears to be in the early stages of tissue development and plastid biogenesis (Sedivy-Haley et al., 2011).
The picture emerging for the Arabidopsis plastid rhomboids suggests that alternative splice variants are involved in regulatory-type activities. Such a phenomenon is observed in other organisms, for example with the human rhomboid gene RHBDD2 (Abba et al., 2009). The levels of the two alternatively spliced RHBDD2 mRNAs were elevated in breast cancer cell lines (Abba et al., 2009). In terms of other regulatory roles, rhomboid proteins themselves are often associated with regulatory activities in different organisms and in a range of cellular pathways, such as development, mitochondrial membrane remodeling, protein transport, quorum sensing, and stress response (see examples in Adrian & Freeman, 2012; Jeyaraju et al., 2013; Lemberg, 2013; McQuibban et al., 2003; Sedivy-Haley et al., 2011; Sekine et al., 2012; Stevenson et al., 2007; Thompson et al., 2012; Urban, 2006; Yan et al., 2008). The possibility of alternative splicing contributing further to these different types of regulatory activities is thus intriguing and warrants investigation.
It is now clear that to understand how the plastid rhomboid system works, it is necessary to identify all potential rhomboid variants in play. This study was thus designed to examine the splicing activities of At1g74140, a close relative to At1g74130 and located tandemly next to At1g74130. An analysis of the At1g74140 transcript population would help uncover potential modes of operation for this gene and its possible role in Arabidopsis plastids.

Methods
Arabidopsis and the propagation regime used
Growth chambers were used to propagate the plants employed in this study. The cultivation conditions for the Arabidopsis line CS60000 ("wild-type") were derived from information posted on the Arabidopsis Biological Resource Center (Ohio State University) website (http://abrc.osu.edu) (Arabidopsis Biological Resource Center, RRID:SCR_008136) (Alonso et al., 2003). The growth chamber parameters used were: 21°C; 70% humidity; and a 16:8 hour light:dark photoperiod using 150-200 μmol·m⁻²·s⁻¹ of fluorescent and incandescent lighting. Seeds were cold stratified for 3 days before transfer to growth chambers. All plants were subjected to a regime of daily watering and weekly fertilization.

PCR-based procedures for the analysis of transcript populations
Procedures and optimization details related to the analysis of transcript structure were the same as those described by Sedivy-Haley et al. (2012). Briefly, all plant tissues destined for RNA extraction were collected between 11:00 and 13:00 h during the light period to equalize circadian effects. Total RNA was isolated and treated with DNase using Qiagen kits (RNeasy kit and RNase-free DNase set, Qiagen, Hilden, Germany) (QIAGEN, RRID:SCR_008539). Transcript structure was studied using Qiagen PCR kits (One-Step RT-PCR kit, Qiagen, Hilden, Germany) (QIAGEN, RRID:SCR_008539) and a set of exon-exon specific DNA primers. A summary of the DNA primers used is provided in the Extended data: Table S1. The RT-PCR steps and cycle timings used were as recommended by the manufacturer, Qiagen, without modifications. The RT-PCR assays were all conducted with 20 ng of total RNA and 35 to 40 cycles to visualize lower level products. The temperature used was 55°C. Prior optimization RT-PCR assays were conducted with varying temperatures, cycles, and total RNA amounts. All assays were conducted using an Eppendorf Mastercycler for microplates. PCR products were resolved by running samples in standard 4% (w/v) Tris-borate-EDTA polyacrylamide gels. The DNA markers used were purchased from FroggaBio (Toronto, Ontario, Canada).

Comparison of At1g74140 and the nearby related At1g74130
There are currently four genes predicted for plastidial rhomboid proteins in Arabidopsis (Kanaoka et al., 2005; Knopf & Adam, 2012; Koonin et al., 2003; Tripathi & Sowdhamini, 2006). Three are located on chromosome 1 and one on chromosome 5. Our previous work on the transcript structures of At1g25290 and At1g74130 revealed alternatively spliced variants, two for At1g25290 and three for At1g74130 (Powles et al., 2013; Sedivy-Haley et al., 2012). The other rhomboid gene on chromosome 1, At1g74140, is highly similar to At1g74130 and located downstream from it. At1g74130 is located between nucleotide positions 27873611 and 27876033, and At1g74140 occupies 27876905 to 27879780. The overall exon-intron structures of At1g74130 and At1g74140 are similar, each with 8 exons and 7 introns (Figure 1A). Although the exon lengths are similar, there appear to be substantial differences in the lengths of the middle introns 3, 4, and 5 (Figure 1A). A comparison of the translated products from the available At1g74130 and At1g74140 cDNA entries or sequences (five and three, respectively) also indicated a high degree of similarity with a few gaps (Figure 1B). The differences in intronic composition may thus reflect the different roles of the two genes, a notion that is consistent with the outcomes of the At1g74140 transcript structure analysis below.
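As a quick check on the tandem arrangement described above, the quoted chromosome 1 positions can be turned into gene spans and an intergenic distance. This is only a sketch: it assumes the positions are 1-based and inclusive (the text does not state the convention), and the function and variable names are illustrative.

```python
# Chromosome 1 positions as quoted in the text.
# Assumption: coordinates are 1-based and inclusive at both ends.
GENES = {
    "At1g74130": (27873611, 27876033),
    "At1g74140": (27876905, 27879780),
}


def span_length(start, end):
    """Length in bp of an inclusive coordinate interval."""
    return end - start + 1


def intergenic_gap(upstream, downstream):
    """Bases lying strictly between two non-overlapping inclusive intervals."""
    return downstream[0] - upstream[1] - 1


lengths = {name: span_length(*coords) for name, coords in GENES.items()}
gap = intergenic_gap(GENES["At1g74130"], GENES["At1g74140"])

for name, bp in lengths.items():
    print(f"{name}: {bp} bp")
print(f"intergenic gap: {gap} bp")
```

Under that assumption the two loci span 2423 bp and 2876 bp, separated by 871 bp, which agrees with the statement that At1g74140 has similar exon lengths but longer middle introns.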

Analysis of At1g74140 transcript structure and its splicing activities
For the analysis of At1g74140 transcripts, the overall RT-PCR strategy and primer pairs used were designed to map alternative splicing activities in the context of transcript populations (Table S1 summarizes the details of the exon-exon specific primers used). These primers should allow detection of the many differently spliced exonic structures in the transcript population, along with changes to levels of neighboring exons represented in the resulting products. Such modulations in RT-PCR product levels should be indicative of different transcript splicing activities. This approach was employed to characterize alternatively spliced variants in our previous studies on At1g25290 and At1g74130 (Powles et al., 2013; Sedivy-Haley et al., 2012).
A combined RT-PCR strategy, using stepped and laddered primer pairs, was used to characterize the different structures that existed in the transcript population. The outcomes were sorted and summarized schematically in Figure 5-Figure 10. The presence or absence of each exon and their relative levels were surveyed and compiled. These exon-specific RT-PCR outcomes are sorted and organized schematically as black lines in Figure 2 (below the gene map). The weight of the black lines, for example, thick versus thin, represents relative RT-PCR product levels. From this summation map of the stepped RT-PCR strategy, we were able to observe missing exon-exon products as well as imbalances in the relative levels of neighboring exon-exon products. Products primed from the start region of exon 1 and forward region of exon 8 were not detected. Products primed from the latter end of exon 1 to exon 7 were present, but exhibited changes to their relative levels, even if the neighboring RT-PCR products were expected to be comparable with respect to size and RT-PCR assay parameters.
The outcomes were next sorted and organized as sequential sets of laddered primers in the 5' to 3' direction. Different splicing activities were detected by the appearance or disappearance of different-sized RT-PCR splice products, and by changes in the levels of RT-PCR products derived from neighboring exons (the red lines in Figure 2-Figure 4). The overall patterns generated indicate that At1g74140 transcripts were not processed as predicted. Primers derived from the 5'UTR, the start of exon 1, the end of exon 8, or the 3'UTR did not consistently result in RT-PCR products that spanned the entire open reading frame. Products that did arise with the exon 8 or 3'UTR primers were present at lower levels than many of the products generated by the other primer pairs. This pattern is different from the closely related At1g74130 (Powles et al., 2013; Sedivy-Haley et al., 2012). The overall pattern displayed for At1g74140 indicates a high level of alternative splicing activity with extensive discontinuity in the resulting products.
The high level of discontinuity appears to originate mainly from two regions, the sequences spanning exon 2 and exon 7 (Figure 3 and Figure 4). Both regions appear to contribute, separately and in combination, to the observed discontinuity phenomenon. Signs indicating splicing discontinuity manifest as disappearances of predicted RT-PCR products and lower levels of predicted products as a result of re-distribution into differently spliced transcript sub-populations. These occurrences are represented as red lines in Figure 3 and Figure 4 (hatched for missing, solid for present, and thickness for lower relative levels). The patterns, when considered together, suggest that the sequence spanning the 5' end of exon 1 to approximately exon 6 is frequently separated from the latter section spanning approximately from exon 6 to exon 8. The resulting products from the two regions appear to be distributed into different transcript sub-populations.
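The stepped and laddered primer layout used in this section can be pictured with a short enumeration over the eight exons. The sketch below is illustrative only. It assumes that "stepped" means primer pairs a fixed number of exons apart, walked along the gene, and that "laddered" means one fixed forward exon paired with successively more distant reverse exons. The actual primer pairs are those listed in Table S1, and the function names here are hypothetical.

```python
from itertools import combinations

N_EXONS = 8  # At1g74140 is described as having 8 exons and 7 introns


def stepped_pairs(n_exons, step=1):
    """(forward exon, reverse exon) pairs a fixed number of exons apart.

    Assumed reading of "stepped": the same-sized window walked 5' to 3'.
    """
    return [(i, i + step) for i in range(1, n_exons - step + 1)]


def laddered_pairs(n_exons, anchor):
    """Pairs sharing one forward exon while the reverse exon moves 3'.

    Assumed reading of "laddered": products of increasing expected size.
    """
    return [(anchor, j) for j in range(anchor + 1, n_exons + 1)]


def all_pairs(n_exons):
    """Every possible forward/reverse exon combination, for reference."""
    return list(combinations(range(1, n_exons + 1), 2))


print("stepped :", stepped_pairs(N_EXONS))
print("laddered:", laddered_pairs(N_EXONS, anchor=1))
print("total possible pairs:", len(all_pairs(N_EXONS)))
```

In such a layout, a missing long laddered product alongside present short stepped products is the kind of pattern the text describes as discontinuity.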

Discussion and conclusions
The Arabidopsis rhomboid system consists of at least 13 expressed full-length genes for active and inactive types that work in different cellular locations (Knopf & Adam, 2012; Koonin et al., 2003; Lemberg & Freeman, 2007; Tripathi & Sowdhamini, 2006). Of these 13 genes, two are for active plastidial types (At1g25290 and At5g25752) and two are for inactive plastidial types (At1g74130 and At1g74140) (Knopf et al., 2012; Sedivy-Haley et al., 2011; Sedivy-Haley et al., 2012; Thompson et al., 2012). The number of plastid rhomboid variants is further increased by alternative splicing, such as that reported for At1g25290 (Sedivy-Haley et al., 2012) and At1g74130 (Powles et al., 2013). It is thus important to characterize the status of all rhomboid forms before studying the plastid rhomboid system. This particular study focuses on a close relative of At1g74130, At1g74140.
The various RT-PCR outcomes obtained in this study together uncovered a complex, atypical alternative splicing pattern for At1g74140. In contrast to the alternative splicing patterns of At1g25290 and At1g74130, each of which generated defined variants, the splicing activities of At1g74140 produced an array of shorter, spliced transcripts derived from different exonic stretches of the gene sequence. Unlike its close relative At1g74130, spliced At1g74140 transcripts that span the entire gene sequence were not produced at our level of detection and are thus not represented as a significant pool in the transcript population. This outcome suggests that the function of At1g74140 is likely not based on the full length of the gene, as it is for At1g74130, but on many shorter stretches. It is not known at this juncture if the different shorter, spliced transcripts are functional at the translational level. The shorter, spliced transcripts themselves are more likely to represent functional entities, such as microRNAs or other RNAs involved in RNA interference.
The complex splicing pattern of At1g74140 appears to be triggered predominantly by two regions, the stretch spanning exon 2 and the stretch spanning exon 7. Many of the spliced products predicted to span exon 2 and/or 7 were absent or produced in lower than expected relative levels. The lowering of relative levels is also due to the splitting of the one predicted pool into several pools with different spliced transcript variants. A preliminary analysis of sequences and RT-PCR products from the two regions revealed microRNA possibilities, but this is being further investigated in depth at this point.
Despite our limited understanding of the roles played by splice variants, we have witnessed the employment of alternative splicing as a mechanism for diversifying plastid rhomboid protein function in Arabidopsis. Alternative splicing is used to create protein variants with changes to functionality for At1g25290 and At1g74130. In this report, the data provide evidence that another Arabidopsis plastid rhomboid gene, At1g74140, carries the potential to function through its alternatively spliced short transcript products. This function could be in the form of shorter translational products and/or short RNAs. This potential is quite different from its close relative At1g74130. Preliminary studies with different developmental tissues indicate that some of the alternatively spliced short transcript products fluctuate between mature and young tissues, hence the short transcripts appear to bear functional significance (see Underlying data: Figures S1-S3). For now, the role(s) played by the short transcripts of At1g74140, intriguing though they may be, await elucidation.
This project contains the following underlying data:
• Figure S1. RT-PCR results for various primer combination set 1 and the two developmental stages analyzed.
• Figure S2. RT-PCR results for various primer combination set 2 and the two developmental stages analyzed.
• Figure S3. RT-PCR results for various primer combination set 3 and the two developmental stages analyzed.

Extended data
Zenodo: Splicing patterns of the Arabidopsis organellar rhomboid At1g74140 -

Alternative splicing (AS) definitely contributes to the complexity of gene regulation at the posttranscriptional level. To understand if and how AS is involved in the regulation of gene expression for a particular multiexon gene, it will be necessary to determine the pattern of alternative splicing and to identify functional splice variants. In this study, Ko et al. report their alternative splicing analysis of the Arabidopsis gene At1g74140. They applied an RT-PCR approach with a variety of primer sets to detect alternatively spliced variants of At1g74140. The main result obtained in this study is that alternative splicing of At1g74140 was mainly attributed to two regions of this gene. Based on the RT-PCR results, the authors concluded that the overall discontinuous splicing pattern of the At1g74140 gene may reflect its mode of involvement in activities like controlling gene expression.

Comments:
Since the study only verified splice variants already annotated in publicly available databases, no new variant was detected. RT-PCR can no doubt be used to detect particular types of splicing events, but this method is prone to false positives, especially when too many PCR cycles are used. Given that 35-40 PCR cycles were used in this study, it is difficult to conclude that some of the faint bands seen in the gels came from mRNA rather than from contamination. At a minimum, no-RT samples should have been included as negative controls in these PCR experiments, and these were missing. Further, the conclusion that the discontinuous splicing pattern of the At1g74140 gene is associated with gene expression has no support from these RT-PCR analyses; no gene expression analysis was conducted to show that any of these splice variants contributes to gene expression levels.

The manuscript entitled "Analysis of the Arabidopsis organellar rhomboid At1g74140 transcript population uncovered splicing patterns different from its close relative At1g74130" by Ko and colleagues is an attempt to characterize the alternative splicing patterns of At1g74140 in comparison to the previously characterized one, At1g74130. After carefully reading the paper and analyzing the methods, I have to sadly say that I do not think this manuscript represents an advance in the knowledge of these genes (or At1g74140 in particular), since the alternative splicing isoforms are already annotated in JBrowse, Araport and TAIR, and I do not think the methods used are appropriate to search for (or validate) alternative splicing isoforms.

The use of different overlapped/stepped PCR amplicons to check for "different splicing activities" is not a reliable method. PCR is extremely sensitive and with enough cycles could amplify even noise. Primer design is wrong in many cases. When the authors use primers that should span exon junctions, they have far too many bases on the 3' end of the primers. This means that these oligos could partially bind (and by partially I mean 13-16 bases) to a target and give a PCR product that would be misleading. When generating a primer for a junction, the authors must limit the binding to the 3' exon/region to only 3-5 bases.

Moreover, these two genes do not show any expression at all in the RNA-seq data on JBrowse in any condition there. In addition, I analyzed other publicly available RNA-seq data to see whether these genes are expressed at all, and I was not able to find reliable reads for these loci in the several conditions tested. From my point of view, the expression of these genes, and of At1g74140 in particular, is marginal (in several conditions). Hence, any conclusion based on the expression of a few, or just some partial, RNA molecules would not be reliable. As I said before, by using PCR you can get a signal even from the almost non-existent expression of a gene.

Since there are no amplification controls, for example using genomic DNA to test the primers, when there is no amplification we cannot say that a particular cDNA (corresponding to an isoform) is absent or that a particular combination of exons is not in the sample. The lack of amplification of the longer amplicons ("full length RNA") should be taken as a signal of a problem. Since, as expressed before, there are no amplification controls and some primers do work in some combinations or in shorter amplicons, I would guess that the expression of the "full length RNA" is so low that the authors are only able to get some partial RNAs that are most likely expression noise. The authors do not report the number of cycles they used; hence, I do not know for sure whether the expression is low or not. However, I am concluding this from the intensity of the bands in the gels, the lack of amplification of the longer amplicons, and the fact that there are no reads at all for this gene in different RNA-seq experiments. The authors should provide this information, so the reader knows how many cycles are needed to get a signal from the different amplicons. In addition, if the authors have evidence that expression is not marginal in their conditions, they should add that to this manuscript. Moreover, showing different conditions where splicing patterns are affected would also increase the impact of this work.

Besides these main issues, I would also like to add that the conclusion that the shorter, spliced transcripts could be functional entities such as miRNAs goes far beyond the data presented. Furthermore, the authors introduce the terms "active"/"inactive" but never explain their meaning in the context of these proteins. In addition, the authors discuss splicing patterns but do not see them in their data, and they introduce the concept (?) of 'sequence "discontinuity"', which is not further explained and seems, in fact, to refer to alternative splicing.
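The reviewer's point about junction-spanning primers, that only 3-5 bases at the primer's 3' end should anneal to the downstream exon, can be made concrete with a small sketch. This is not the primer design used in the study. The exon sequences below are hypothetical placeholders, the 20-base primer length is an arbitrary choice, and `junction_primer` is an illustrative helper rather than part of any primer design tool.

```python
def junction_primer(upstream_exon, downstream_exon, length=20, overhang=4):
    """Build a forward primer that spans an exon-exon junction.

    The primer body comes from the 3' end of the upstream exon and only
    `overhang` bases (3-5, following the reviewer's rule of thumb) come
    from the 5' end of the downstream exon.
    """
    if not 3 <= overhang <= 5:
        raise ValueError("overhang should be 3-5 bases")
    body = length - overhang
    if len(upstream_exon) < body or len(downstream_exon) < overhang:
        raise ValueError("exon sequence too short for requested primer")
    return upstream_exon[-body:] + downstream_exon[:overhang]


# Hypothetical placeholder sequences, not taken from At1g74140.
exon_a = "ATGGCTTCAGGATCCTTGACCGTAAGC"
exon_b = "GTTCAAGGCTTACCGATGGA"

primer = junction_primer(exon_a, exon_b, length=20, overhang=4)
print(primer, len(primer))
```

With a short 3' overhang, the primer cannot prime efficiently from the downstream exon alone, so a product is expected mainly when the two exons are actually joined in the template.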