Pan-cancer chromatin analysis of the human vtRNA genes uncovers their association with cancer biology

Background: The vault RNAs (vtRNAs) are a class of 84-141-nt eukaryotic non-coding RNAs transcribed by RNA polymerase III, associated to the ribonucleoprotein complex known as vault particle. Of the four human vtRNA genes, vtRNA1-1, vtRNA1-2 and vtRNA1-3, clustered at locus 1, are integral components of the vault particle, while vtRNA2-1 is a more divergent homologue located in a second locus. Gene expression studies of vtRNAs in large cohorts have been hindered by their unsuccessful sequencing using conventional transcriptomic approaches. Methods: VtRNA expression in The Cancer Genome Atlas (TCGA) Pan-Cancer cohort was estimated using the genome-wide DNA methylation and chromatin accessibility data (ATAC-seq) of their genes as surrogate variables. The association between vtRNA expression and patient clinical outcome, immune subtypes and transcriptionally co-regulated gene programs was analyzed in the dataset. Results: VtRNAs promoters are enriched in transcription factors related to viral infection. VtRNA2-1 is likely the most independently regulated homologue. VtRNA1-1 has the most accessible chromatin, followed by vtRNA1-2, vtRNA2-1 and vtRNA1-3. VtRNA1-1 and vtRNA1-3 chromatin status does not significantly change in cancer tissues. Meanwhile, vtRNA2-1 and vtRNA1-2 expression is widely deregulated in neoplastic tissues and its alteration is compatible with a broad oncogenic role for vtRNA1-2, and both tumor suppressor and oncogenic functions for vtRNA2-1. Yet, vtRNA1-1, vtRNA1-2 and vtRNA2-1 promoter DNA methylation predicts a shorter patient overall survival cancer-wide. In addition, gene ontology analyses of vtRNAs co-regulated genes identify a chromosome regulatory domain, epithelial differentiation, immune and thyroid cancer gene sets for specific vtRNAs. Furthermore, vtRNA expression patterns are associated with cancer immune subtypes and vtRNA1-2 expression is positively associated with cell proliferation and wound healing. Conclusions: Our study presents the landscape of vtRNA chromatin status cancer-wide, identifying co-regulated gene networks and ontological pathways associated with the different vtRNA genes that may account for their diverse roles in cancer.

has three copies, annotated as vtRNA1-1 (98bp), vtRNA1-2 and vtRNA1-3 (88bp) genes, which are located in the chromosome region 5q31.3 (locus vtRNA1) and code for the RNAs that are integral components of the vault particle. Lately, a transcript initially annotated as the human microRNA precursor hsa-mir-886 (Landgraf et al., 2007), was re-classified as another human vtRNA homologue and consequently renamed as vtRNA2- 1 (Stadler et al., 2009). VtRNA2-1 is also located in chromosome 5 (5q31.1, locus vtRNA2) at 0.5Mb distance from the vtRNA1 cluster, where is placed between SMAD5 and TGFBI genes in an antisense direction. Existing evidence indicates that despite being a duplication of the vtRNAs of locus 1, the vtRNA2-1 is neither associated with the vault particle nor co-regulated with the vtRNA1 locus (Lee et al., 2011;Stadler et al., 2009). Nevertheless, the realization that only 5-20% of all the cellular vtRNA transcripts are associated to the vault particle suggests additional roles for the majority of vtRNA transcripts, independent of the vault particle (Kickhoefer et al., 1998;Nandy et al., 2009;van Zon et al., 2001). The vtRNA3-1P and vtRNA2-2P are two additional vault RNA genes situated in chromosome X and 2 respectively, but were classified as a pseudogenes due to the absence of expression in cell lines and the presence of silencing mutations at the promoter (Büscher et al., 2020;van Zon et al., 2001).
Most of the current knowledge of vtRNA function focuses on vtRNA1-1 and vtRNA2-1 roles in viral infection and cancer. Functional studies found that vtRNA1-1 and vtRNA1-2 (but not vtRNA1-3) can interact directly with drugs as mitoxantrone, doxorubicin and etoposide (Gopinath et al., 2005;Gopinath et al., 2010;Mashima et al., 2008). In addition, p62 protein interacts with vtRNA1-1 and inhibits the p62 dependent autophagy in vitro and in vivo (Horos et al., 2019). The vault RNAs have also been linked to native immune response, because of their strong upregulation during the infection of Influenza A virus (IAV) and Epstein-Barr virus (EBV) (Amort et al., 2015;Li et al., 2015;Nandy et al., 2009). These studies proposed a viral activation of host vtRNAs as a mechanism to maintain the inhibition of PKR signaling pathway, representing a viral strategy to circumvent host innate immunity.
Different lines of evidence revealed that the vtRNA transcription can be controlled by promoter DNA methylation. The epigenetic control of vtRNA2-1 is complex and owns clinical relevance in several solid tumors including breast, lung, colon, bladder, prostate, esophagus, hepatic and stomach cancer (Cao et al., 2013;Fort et al., 2018;Lee et al., 2014a;Lee et al., 2014b;Romanelli et al., 2014;Treppendahl et al., 2012;Yu et al., 2020). Intriguing aspects of the epigenetic regulation of vtRNA2-1 locus comprise its dependence on the parental origin of the allele (Joo et al., 2018;Paliwal et al., 2013), and its sensitivity to the periconceptional environment ((Silver et al., 2015) and subsequent independent studies (Richmond et al., 2018;van Dijk et al., 2018)). In addition, vtRNA1-3 promoter methylation was associated with significantly poor outcome in lower risk myelodysplastic syndrome patients (Helbo et al., 2015).
Currently, transcriptomic sequencing is the benchmark technique to study global RNA expression (Stark et al., 2019). Nonetheless, some classes of RNAs are elusive to the standard transcriptomics due to their stable RNA structure and the presence of modified bases or ends, which impair the cDNA synthesis and/or adapter ligation during the sequencing library preparation (Sendler et al., 2011;Zheng et al., 2015). The vtRNAs belong to this group due to their conserved stable stem/hairpin loop secondary structure. The latter, together with the lack of sequencing of 40-200bp-long RNAs in the conventional transcriptomic studies, which are mostly intended for small and long RNAs, probably delayed their study in comparison with other regulatory RNAs. In fact, the 50-300 nucleotides long RNAs are considered the black hole of RNA biology (Boivin et al., 2018;Steitz, 2015). Nonetheless, since chromatin accessibility plays a critical role in the regulation of gene expression, at some extent the transcription of an RNA can be inferred from the chromatin status of its promoter (Corces et al., 2018;Duren et al., 2017;Liu et al., 2018;Park et al., 2017b). Since mounting evidence has shown that vtRNAs expression is tightly controlled by chromatin accessibility, dependent on nucleosome positioning and promoter DNA methylation (Ahn et al., 2018;Fort et al., 2018;Helbo et al., 2015;Helbo et al., 2017;Lee et al., 2014a;Lee et al., 2014b;Park et al., 2017a;Park et al., 2017b;Sallustio et al., 2016;Treppendahl et al., 2012), the chromatin structure is a suitable surrogate marker of vtRNA transcription and could be used as a proxy of their expression.
The expression, regulation, and role of the four human vtRNAs in normal and disease conditions are still poorly understood, and available knowledge indicates that they hold diverse tissue specific activities. The availability of genomic data of large sets of human tissues provided by The Cancer Genome Atlas (TCGA), allows the study of the human vtRNAs across 16 tissue types and normal/cancer conditions. Here, we analyze vtRNA genes chromatin of the TCGA patient cohort withdrawn from two approaches: the assay for transposase-accessible chromatin followed by NGS sequencing (ATAC-seq) to analyze chromatin accessibility (Buenrostro et al., 2013;Buenrostro et al., 2015;Corces et al., 2018;Thurman et al., 2012) and the Illumina Infinium Human Methylation 450K BeadChip to analyze the CpG methylation of DNA (Moran et al., 2016). This led us to determine the patterns of transcriptional regulation of the four human vtRNAs throughout cancer tissues. We also evaluate the association between vtRNA promoter DNA methylation and the patient clinical outcome in all the available cancer types. Finally, looking for functional discoveries, we analyze vtRNA relations with immune subtypes and transcriptionally co-regulated gene programs. Our study reveals specific patterns of inferred expression for each vtRNA that support previous evidence and poses potential new roles and molecular programs in which they may participate across and intra cancer types, increasing the comprehension of the role of the human vtRNAs in health and disease.

ATAC-seq data
The ATAC-seq data obtained by The Cancer Genome Atlas (TCGA) consortium were retrieved from UCSC Xena Browser (Goldman et al., 2020) on 03/10/2018. It comprises the genomic matrix TCGA_ATAC_peak_Log2Counts_dedup_promoter with normalized count values for 404 samples (385 primary solid tumor samples across 23 tissues). As is described in UCSC Xena Browser to calculate the average ATAC-seq values a prior count of 5 was added to the raw counts, then put into a "counts per million", then log2 transformed, then quantile normalized; the result is the average value in the file (log2(count+5)-qn values) across all technical replicates and all biospecimens belonging to the same TCGA sample group. The gene promoter for ATAC-seq is defined as a region within -1000 to +100bp from the TSS site. Assignment of promoter peak to gene mapping information derives from the peak summit within the promoter region. Peak location information of the ATAC-seq values for 500 bp gene promoter regions was retrieved from the file TCGA_ATAC_Log2Counts_Matrix.180608.txt.gz. All the analyses were performed for tissues with at least five primary tumor samples (21 tissues), which resulted in the exclusion of CESC (2 primary tumor samples) and SKCM (4 primary tumor samples). Correlations between ATAC-seq and DNA promoter methylation were performed for the 329 primary tumors that are studied by both strategies (Extended Data: Tables S1 and S2).

Analysis of transcription factors associated to vtRNA genes
The transcription factors (TFs) occupancy analysis of vtRNAs was performed at a ± 3000 bp region centered at each vtRNA gene. The CHIP-seq data was retrieved from the UCSC genome browser (GRCh37/hg19) (Kent et al., 2002) and the K562 chronic myelogenous leukemia cell line was chosen since it has the richest TFs data of ENCODE dataset (ENCODE Transcription Factor Binding tracks -ENCODE 3 TFBS) (Euskirchen et al., 2007). The KEGG pathways enrichment analysis of the core 23 TFs common to all vtRNAs was done using STRING software (Szklarczyk et al., 2017) (Extended Data: Table S6).

Overall survival analysis
The Kaplan Meier curve analysis (Bland & Altman, 1998) was performed with software GraphPad Prism 6 using Log-ranked Mantel-Cox test. We analyzed the ATAC-seq normalized values and DNA methylation average normalized beta-values for vtRNA promoters and survival data until 4000 days (99% of data). The log-rank statistics was calculated for quartiles of vtRNAs promoter chromatin accessibility comparison groups (25 th and 75 th ) (Extended Data: Tables S7 and S8).

Pathway enrichment and cluster chromosome localization analyses
The genome wide correlation analysis of promoter chromatin accessibility (ATAC-seq normalized values) for each vtRNA in comparison with all the annotated genes was performed using the genomic matrix TCGA_ATAC_peak_Log2Counts_dedup_ promoter retrieved from UCSC Xena Browser (385 primary tumor samples across 23 tissues) (Goldman et al., 2020) on date 03/10/2018. The protein coding genes showing a Spearman correlation r ≥ 0.4 were selected for pathway and cluster location analysis. The pathway enrichment analysis was done using STRING software for the Gene Ontology Biological Process and KEGG pathways categories (Szklarczyk et al., 2017). The cluster localization analysis was performed with two approaches, the Cluster Locator algorithm (Pazos Obregón et al., 2018) and the Enrichr software (Kuleshov et al., 2016) (Extended Data: Tables S9 and S10).

Statistical analysis
Unless specified all the variables are expressed as average value ± standard deviation (SD). Statistical analyses were performed with one-way ANOVA for multiple comparison tests, including Sidak's Honest Significant Difference test as a post-hoc test, as referred in the legend of the Figures. Spearman equations were applied to determine the correlations. All the analyses were done in R software (version 3.6) using libraries ggcorrplot, corrplot, xlsx, heatmap.2 (clustering distance measured by Euclidean and Ward clustering algorithms) and software GraphPad Prism 6. The statistical significance of the observed differences was described using the p-value (* p < 0.05, ** p < 0.01, *** p < 0.001, **** p-value < 0.0001). Differences with a p-value of < 0.05 were considered significant.

Results
VtRNA genes have different chromatin accessibility at the promoter region The Cancer Genome Atlas consortium incorporated ATACseq data for tumor samples of the GDC Pan-Cancer dataset (385 primary tumors samples across 23 cancer types (Corces et al., 2018)). The ATAC-seq strategy is a genome-wide sequencing profiling approximation to chromatin accessibility using the hyperactive Tn5 transposase that simultaneously cut and ligate adapters preferentially to the accessible DNA regions. The counts of sequencing reads in a particular genomic region provides a direct measurement of chromatin accessibility (Buenrostro et al., 2015;Tsompana & Buck, 2014). For those gene promoters regulated by DNA methylation, ATAC-seq is a more direct approximation to chromatin accessibility than promoter CpG DNA methylation measurements because the latter is well correlated but not an exact measurement of chromatin accessibility. Indeed, the effect of DNA methylation on chromatin structure depends on the gene and the distribution of the methylation sites along it. It is also important to mention that gene expression interpretations based solely on chromatin accessibility are blind to post-transcriptional regulatory events such as processing, localization, and stability of the RNA transcript. Nevertheless, since the chromatin status of the different vtRNA promoters has been positively correlated with the abundance of their transcripts (Ahn et al., 2018;Fort et al., 2018;Helbo et al., 2015;Lee et al., 2014a;Lee et al., 2014b;Park et al., 2017a;Park et al., 2017b;Sallustio et al., 2016;Treppendahl et al., 2012), ATAC-seq is likely a good surrogate of vtRNA expression. DNA CpG methylation data is available for 746 normal adjacent (across 23 tissues) and 8403 primary tumor (across 32 tissues) tissue samples of the Pan-Cancer TCGA cohort (Extended Data: Tables S1-S5 and Figure S1). The full data is listed in Tables S1-S5 and the tissue composition of the three datasets of the cohort, comprising tumor ATAC-seq, normal and tumor DNA methylation, is shown in Figure S1. The DNA methylation data has nine more tissue types and a higher number of samples than the others, while the abundance of some tissues is skewed to twice enrichment in 1-4 specific tissue types.
To assess the expression of vtRNAs in cancer tissues, we analyzed a 500 bp region, containing the full vtRNA genes, the proximal promoter region and the RNA polymerase lII A and B box elements located downstream of the transcription start site (TSS) (Figure 1). Indeed, Vilalta et al., 1994 demonstrated that the deletion of the 300bp 5´-flanking region of the rat Genome browser (GRCh37/hg19) centered at the 500bp bin highlighted in yellow, which was used for ATAC-seq and CpG methylation analyses. VtRNAs of locus 1 cluster (vtRNA1-1, vtRNA1-2, vtRNA1-3) and of locus 2 (vtRNA2-1) are in the sense and antisense orientation respectively. Several Gene annotation and ENCODE Project tracks for seven cell lines (GM12878, H1-hESC, HSMM, HUVEC, K562, NHEK, NHL) are displayed: DNA accessibility (DNaseI hypersensitivity clusters (color intensity is proportional to the maximum signal strength)), DNA methylation (for CpG islands length greater than 200 bp), histone modification (H3K27Ac, H3K4me1, H3K4me3 marks), conservation of the region in 100 Vertebrates (log-odds Phylop scores). The vertical viewing range of the tracks displays the default settings of the browser for each variable.
vtRNA transcript largely reduced its transcription greater than 30-fold (Vilalta et al., 1994). As depicted in Figure 1, this region bears epigenetic marks of transcriptionally active chromatin (Li et al., 2007), including histone H3K27ac, H3K4me1, H3K4me3 at the promoters and DNaseI hypersensitivity clusters around the TSS in the cell types compiled by UCSC browser (Goldman et al., 2013). Interestingly, vtRNA2-1 is the only vtRNA immersed in a CpG island and vtRNA1-1 displays the strongest epigenetic marks of transcriptional activity ( Figure 1). The analysis of the ATAC-seq values of this 500 bp region of all the TCGA primary tumors reveals that the four human vtRNAs have different levels of chromatin accessibility ( Figure 2A). Moreover, the average DNA methylation of the promoters ( Figure 2B) in primary tumor samples supports this finding (32 tissues, Extended Data: Figure S1 and Tables S2 and S4). VtRNA1-1 promoter has the highest chromatin accessibility and the lowest DNA methylation, showing the smallest dispersion of values among the samples (Ave. 4.1, 0.6 SD) relative to the other three vtRNAs. Meanwhile, vtRNA1-2 (Ave. 2.9, 1.6 SD), vtRNA1-3 (Ave. 1.7, 1.2 SD) and vtRNA2-1 (Ave. 2.5, 1.4 SD) have broader ranges of ATAC-seq and DNA methylation values ( Figure 2, Table 1, Extended Data: Tables S1 and S2 and Figure S1). A comparison among the vtRNAs shows that the average chromatin accessibility of their promoters is inversely proportional to their average DNA methylation (Table 1 and Figure 2) (relative ATAC-seq values are vtRNA1-1 (4.1) > vtRNA1-2 (2.9) > vtRNA2-1 (2.5) and accordingly, relative DNA methylation are vtRNA1-1 (0.09) < vtRNA1-2 (0.4) < vtRNA2-1 (0.5)). Yet vtRNA1-3 average low chromatin accessibility is not accompanied by a comparatively denser promoter methylation (values 1.7 and 0.22 respectively). Nevertheless, the intra-sample correlation between the ATAC-seq and DNA methylation for the vtRNAs is between -0.24 and -0.78 (see rs-values in Table 1 and Extended Data: Figure S2), suggesting that regardless of the relatively smaller average effect of DNA methylation on vtRNA1-3 promoter accessibility, all the vtRNAs are regulated by DNA methylation to some extent ( Table 1). In addition, the CpG DNA methylation negatively correlates with ATAC-seq values in the 500bp of the promoters of the four vtRNAs (Table 1 and Extended Data: Figure S2). The strength of the correlation for each vtRNA is higher for the two vtRNAs that have a broader spectrum of DNA methylation, i.e. vtRNA1-2 and vtRNA2-1 (rs = -0.74 and rs = -0.78 respectively) ( Table 1 and Extended Data: Figure S2B and S2D). Meanwhile, vtRNA1-1 and vtRNA1-3 promoters, which have narrower range of DNA methylation (vtRNA1-1 has also a small range of chromatin accessibility) present a smaller association between both values (rs = -0.24 and rs = -0.54 respectively) ( Table 1 and Extended Data: Figure S2A and S2C). Technical differences in ATAC-seq and DNA methylation array approaches, as well as naturally occurring non-linear correlation between average DNA methylation and chromatin structure may account for these differences (Corces et al., 2018;Liu et al., 2018). The low ATAC-seq value of the pseudogenes vtRNA3-1P (Ave. -1.2, 0.6 SD) and vtRNA2-2P (Ave. 0.1, 0.6 SD), represents a proof of concept of the analyses (Extended Data: Figure S3 and Tables S1-S5). The same approach was performed for the Y-RNAs genes RNY1 (Ave. 3.7, 0.4 SD), RNY3 (Ave. 2.1, 0.6 SD, RNY4 (Ave. 1.9, 0.6 SD) and RNY5 (Ave. 3.1, 0.4 SD) (Canella et al., 2010), which code for ubiquitously expressed RNA polymerase III transcripts (Extended Data: Figure S3 and Tables S1).
Chromatin accessibility, DNA methylation and transcription factor binding at the four vtRNA promoters are correlated Aiming to investigate a possible transcriptional co-regulation of vtRNA genes, we determined the pairwise correlations in chromatin accessibility and DNA methylation among the four vtRNA promoters. We found that all pairs of vtRNAs, except vtRNA1-2 and vtRNA2-1, show positive correlations in both datasets ( Figure 3A and 3B) (rs-values 0.23-0.41 for ATACseq and rs-values 0.20-0.50 for DNA methylation). The unsupervised clustering of these epigenetic marks indicates that the three vtRNAs clustered at vtRNA1 locus (vtRNA1-1, vtRNA1-2 and vtRNA1-3), which are the vault particle associated vtRNAs, are more similar than vtRNA2-1 ( Figure 3A and 3B).
Given that the chromatin accessibility of a DNA region is partially controlled by transcription and chromatin remodeler's factors (Corces et al., 2018;Klemm et al., 2019), we analyzed transcription factors (TFs) occupancy at a ± 3000 bp region centered at each vtRNA gene. Since there is no such study of the TCGA samples, we investigated the CHIP-seq data of the K562 chronic myelogenous leukemia cell line, which has the richest TF data at ENCODE dataset (Euskirchen et al., 2007;Goldman et al., 2013). We observed that the four vtRNAs share a common core of 23 TFs (26%). The three members of the vtRNA1 cluster share 16 additional TFs (18%), reaching 39 common TFs (44%), while only 2-4 TFs with vtRNA2-1 ( Figure 3C and 3D and Extended Data: Table S6). We finally asked whether this common core of TFs is linked to a specific biological process. The STRING enrichment analysis of the core of 23 TFs common to all vtRNAs identified HTLV-I infection (ATF1, ATF3, E2F1, EP300, NFATC3, TBP), Hepatitis B (E2F1, EP300, NFATC3) and Transcriptional misregulation in cancer (AFF1, ATF1, MAX) as the top three enriched KEGG pathway terms (FDR < 0.05) ( Figure 3D and Extended Data: Table S6).
Taken together these findings indicate that the transcriptional activity of the vtRNAs is gene specific, being vtRNA1-1 the most accessible and possibly the most expressed. In addition, the data suggest that the regulatory status of the vtRNA promoters could be coordinately controlled, and the three vtRNA1s are more co-regulated among themselves. Yet, the core of TFs shared by the four vtRNAs is associated with viral infection and cancer related terms in the myeloid cell line K562.
Using the vtRNAs promoter DNA methylation data, we extended the analysis to 32 primary tumor tissues, incorporating nine more tissues than ATAC-seq samples (DLBC, KICH, OV, PAAD, READ, SARC, THYM, UCS, UVM) ( Figure 4B, Extended Data: Figure S1) and 16 normal adjacent tissues ( Figure 4C, Extended Data: Figure S1). Again, the DNA methylation profile mirrors the chromatin accessibility in the individual tissue types ( Figure 4). Remarkably, the association between the average vtRNA promoter chromatin accessibility and DNA methylation of the vtRNA1 cluster in each tissue type is higher than that observed for the average of all tissues (vtRNA1-1 rs = -0.28, vtRNA1-2 rs = -0.90 vtRNA1-3 rs = -0.74 and vtRNA2-1 rs = -0.58) ( Table 1 and Extended Data: Figure S4). The opposite finding for vtRNA2-1 may be explained by the impact of the previously recognized epigenetic variation in the locus (Silver et al., 2015), which may prevail over the tissue specific variation for this gene.
VtRNA promoter´s DNA methylation is associated with tumor stage and tissue of origin The assessment of differential vtRNA expression from normal to tumor condition at the TCGA cohort can only be performed using DNA methylation dataset since no ATAC-seq analyses were performed in the normal tissues. The average beta-value of CpG sites in the 500bp vtRNAs promoter of normal and tumor tissue evidenced gene specific patterns of deregulation in neoplastic tissues ( Figure 5). As depicted in Figure 4, vtRNA1-1 and vtRNA1-3 promoter DNA methylation and chromatin accessibility is low and high tissue-wide respectively, in both normal and tumor tissues (yet, there are highly methylated tumor outliers as LGG, that lack normal counterpart) ( Figure 4 and Figures 5A and 5C). Conversely, vtRNA1-2 and vtRNA2-1 revealed significant differences in average promoter DNA methylation among the tissue categories ( Figures 5B and 5D). In agreement with previous results, the methylation of vtRNA1-2 promoter decreases from normal to tumor samples, while vtRNA2-1´s increases ( Figures 5B and 5D and Figure 4).
We next asked whether the promoter status of the vtRNAs is deregulated during the normal and tumor transformation of different tissue types. This comparison was restricted to the 16 tissues with normal samples (with data of at least five samples, Figure 6 and Extended Data: Figure S1, Table S5). The results revealed that vtRNA1-1 is globally unmethylated in normal and tumor from different tissue origins, but presents small but statistically significant changes in 4 tissue types (OG profile in BLCA, THCA UCEC and TSG in LUSC), whose biological impact remains to be determined ( Figure 6A). Similarly, vtRNA1-3 is globally unmethylated, showing larger variability and significant upregulation in BRCA and PRAD, compatible with a TSG function (TSG trends for LIHC p-value = 0.08 and LUSC p-value = 0.09, Figure 6C). In agreement with the literature, vtRNA2-1 presents a TSG pattern of deregulation in PRAD, LUSC and BRCA (Cao et al., 2013;Fort et al., 2018;Romanelli et al., 2014), and an OG pattern in KIRP that has not been previously described ( Figure 6D). Finally, a statistically significant dysregulation of vtRNA1-2 promoter DNA methylation across almost all cancer types (13 out of 16 tissues analyzed) poses it as a candidate OG in cancer ( Figure 6B).
an OG like epigenetic de-repression while vtRNA2-1 (and to a lesser extent vtRNA1-1, vtRNA1-3) has a TSG like repressive methylation in tumor tissues.
Low DNA methylation at the promoter of vtRNA1-1, vtRNA1-2 and vtRNA2-1 is associated with low patient overall survival Seeking for a possible clinical significance of the chromatin accessibility change of the vtRNA promoters, we studied its association with the overall survival of patients (Figure 7 and Extended Data : Tables S7 and S8). The analysis of patients stratified by vtRNA promoter accessibility quartiles did not show differences in overall survival (data available at Extended Data: Table S8). However, a significantly lower patient survival probability when patient tumors have lower promoter DNA methylation is observed for vtRNA1-1 (p-value = 0.003), vtRNA1-2 (p-value = 0.004) and vtRNA2-1 (p-value = 0.002) (patient stratified in quartiles of the promoter DNA methylation cohort values, Figure 7). Therefore, a lower promoter DNA methylation of vtRNA1-1, vtRNA1-2 and vtRNA2-1, a surrogate of their high expression, might be associated with poor patient overall survival cancer-wide.
Genome-wide chromatin accessibility correlations with vtRNA promoters reveal their link to specific cancer related functions Seeking to get insight into the function of the vtRNAs from their transcription regulatory marks, we searched for the genes co-regulated at chromatin level. We calculated the correlation of the ATAC-seq promoter values of each vtRNA and all the individual genes in the 385 primary tumor cohort and selected those showing a Spearman correlation rs ≥ 0.4. Using STRING software analysis, we then investigated their connection with Biological Process categories and KEGG pathways (Szklarczyk et al., 2017). A few enriched terms for the genes correlated with vtRNA1-2, vtRNA1-3 and vtRNA2-1 is identified (FDR < 1×10 -3 ) (Extended Data: Tables S9 and S10). The 196 protein coding genes correlated with vtRNA1-2 are enriched in the Biological Process "Epithelial cell differentiation" (FDR < 1×10 -4 ) (Extended Data: Table S10). Meanwhile, the 358 protein coding genes correlated with vtRNA2-1 are enriched in the Biological Processes "Immune system process", "Inflammatory response" and "Immune response" (FDR < 1×10 -5 ) and the KEGG pathways term "Cytokine-cytokine receptor interaction" (FDR < 1×10 -4 ) (Extended Data: Table S10). The only six protein coding genes correlated with vtRNA1-3 are enriched in the "Thyroid cancer" KEGG pathway term (FDR < 1×10 -4 ).
Remarkably, although STRING analysis of the vtRNA1-1 associated 37 protein coding genes did not find enriched process or pathways, 36 of the 37 correlated genes are located at chromosome 5q region. Using the Cluster Locator algorithm (Pazos Obregón et al., 2018), we found a statistically significant non-random clustering behavior for the chromosomal location of the genes co-regulated with vtRNA1-1 (p-value < 1×10 -10 ). Indeed, when the analysis was extended to 173 protein coding genes with Spearman correlation rs ≥ 0.3, 139 genes of 173 were situated at chromosome 5q region (p-value < 1×10 -10 ). These genes are positioned in particular regions at chr5q31, chr5q35, chr5q23, chr5q14, chr5q32, chr5q34 (FDR < 1×10 -4 ) (Kuleshov et al., 2016) (Extended Data: Table  S10). We did not find a similar phenomenon for the other three vtRNAs. In summary, vtRNA1-3, vtRNA1-2 and vtRNA2-1 transcription may be co-regulated with genes belonging to specific biological processes located elsewhere, while vtRNA1-1 transcription may be controlled by a locally specialized chromatin domain at chromosome 5q region lacking apparent functional relatedness.
Immune Subtypes associated with the vtRNAs profile Taking into account that vtRNAs have been previously related to the immune response (Golec et al., 2019;Li et al., 2015;Nandy et al., 2009), and that we found an association of vtRNA2-1 with immune related terms, we sought to investigate a putative relation between the vtRNAs expression and the six Immune Subtypes defined by Thorsson et al. (Thorsson et al., 2018), compiled at Pan-Cancer TCGA (Extended Data: Tables S11-S14). These six Immune Subtypes are named because of the foremost immune characteristic, comprising wound healing, IFN-g dominant, lymphocyte depleted, inflammatory, immunologically quiet and TGF-b dominant types (Thorsson et al., 2018). As was expected, mirrored patterns were observed for promoter ATAC-seq and DNA methylation data of the six groups ( Figure 8). The immunologically quiet subtype shows a concerted shutdown of all vtRNA promoters (Figure 8), which is largely composed by LGG samples showing low vtRNA ATAC-seq and high promoter methylation average values in LGG tumors (Figures 4A and 4B). The immunological quiet subtype exhibits the lowest leukocyte fraction (very low Th17 and low Th2) and the highest macrophage:lymphocyte ratio (high M2 macrophages) (Thorsson et al., 2018). The remaining subtypes present a similar vtRNA profile except for the inversion of vtRNA1-2/vtRNA2-1 accessibility ratio, which is higher than 1 for the wound healing and IFN-g dominant subtypes while is lower than 1 for the lymphocyte depleted and inflammatory subtypes ( Figure 8A). As expected, a mirrored pattern was observed for vtRNA1-2 and vtRNA2-1 at DNA methylation data ( Figure 8B). Since the wound healing and IFN-g dominant subtype have high proliferation rate opposite to the lymphocyte depleted and inflammatory subtypes (Thorsson et al., 2018), we evaluated the correlation between vtRNA1-2 and vtRNA2-1 chromatin status and proliferation rate or wound healing status defined by Thorsson et al. (Extended Data: Tables S12 and S14) (Thorsson et al., 2018). A negative correlation between vtRNA1-2 promoter DNA methylation and the proliferation rate (rs = -0.44) and wound-healing features of the tumors is revealed (rs = -0.48) (Extended Data: Tables S12 and S14). These findings suggest that vtRNA transcriptional regulation may be associated with tumor immunity.

Discussion
Although important insight into the vtRNA function has been recently gained, the landscape of vtRNA expression in human tissues and cancer was unknown, perhaps for the absence of vtRNA data in transcriptomic studies. Here we used ATAC-seq and methylation arrays data from the Pan-Cancer TCGA consortium (Weinstein et al., 2013), as surrogate variables to uncover the expression of vtRNAs across multiple cancer types. A limitation of our study is its blindness to post-transcriptional controls of vtRNA expression, such as RNA processing, transport, localization and/or stability. Indeed, there is evidence of vtRNA transcript regulation via RNA methylation by NSUN2, or via its association to other proteins like DUSP11, DIS3L2, SSB, SRSF2 and DICER (Burke et al., 2016;Hussain et al., 2013;Kickhoefer et al., 2002;Łabno et al., 2016;Persson et al., 2009;Sajini et al., 2019). Nonetheless, the chromatin status of the vtRNA promoters has been correlated with the expression of their transcripts in several reports (Ahn et al., 2018;Fort et al., 2018;Helbo et al., 2015;Helbo et al., 2017;Lee et al., 2014a;Lee et al., 2014b;Sallustio et al., 2016;Treppendahl et al., 2012). These evidences unequivocally stand for the use of chromatin structure as a suitable proxy for vtRNA expression.
An integrated genomic view of the regulatory features of the vtRNA genes compiled in UCSC genome browser indicates that the four vtRNAs have chromatin signs of active transcription in different cell lines and defines a proximal promoter regulatory region of approximately 500 bp (Nikitina et al., 2011;Vilalta et al., 1994  variation in all the evaluated tissues, which is consistent with its identity as the ancestor vtRNA gene and its significance as the major RNA component of the ubiquitous vault particle (Stadler et al., 2009;van Zon et al., 2001). In addition, vtRNA1-2, vtRNA1-3 and vtRNA2-1 have lower chromatin accessibility at their promoters and a larger variation among the samples. As expected, the average DNA methylation assessed by microarrays is negatively correlated with the chromatin accessibility assessed by ATAC-seq. Therefore, the landscape of vtRNA chromatin status was corroborated by both approaches. Yet, the low chromatin accessibility of vtRNA1-3 promoter is not reflected by a concomitant high DNA methylation, suggesting that additional regulatory features of this gene, acting in trans or cis, play a major influence in its chromatin structure. As was reported by Helbo et al. Seeking to investigate regulatory connections between the vtRNAs as a valuable tool to study their function, we determine the pairwise correlations between the promoters of the vtRNAs. The positive correlations found are compatible with a coordinated transcriptional control of the vtRNAs, that is greater for the three members of the vtRNA1 cluster (specially vtRNA1-1 and vtRNA1-3). The exception is the pair vtRNA1-2 and vtRNA2-1, whose chromatin accessibility and promoter methylation are not linked in the cohort and can be indeed negatively associated.
As an alternative approach to investigate the vtRNA transcriptional co-regulation, we analyzed transcription and chromatin remodeling factors associated with their proximal promoter region. It is worth to mention that RNA polymerase II TFs clustered near TSS of RNA polymerase III transcripts could modulate occupancy and transcription rate of RNA polymerase III through RNA pol II activity (Campbell & White, 2014;Gerber et al., 2020;Raha et al., 2010;Zhang et al., 2019). Specifically, RNA polymerase II could assist chromatin opening, thus allowing accessibility of RNA polymerase III or its associated factors in the human genome (Raha et al., 2010). Interestingly, the 23 TFs common to all vtRNAs are related to viral infections and cancer. In agreement with their higher chromatin correlation, the vtRNA1 clustered genes have a larger core of common TFs (39) in comparison with vtRNA2-1 (23 TFs). The divergence of the transcriptional control of vtRNA2-1 was previously recognized given its unique regulatory elements and the complex developmental methylation it undergoes (Canella et al., 2010;Carpenter et al., 2018;Helbo et al., 2017;Nandy et al., 2009). The upregulation of the four vtRNAs upon viral infection has been reported for several viruses, so it is possible that this core of TFs participate in the coordinated expression of vtRNA during viral infection and related responses (Amort et al., 2015;Li et al., 2015;Nandy et al., 2009). These TFs regulate cell cycle and translational arrest as well as the inhibition of host innate immune response (Attar & Kurdistani, 2017;Ertosun et al., 2016;Horsley & Pavlath, 2002;Jean et al., 2000;Persengiev & Green, 2003;Rohini et al., 2018). Since these processes are also hallmarks of cancer, the dysregulation of the normal function of vtRNAs in viral response may provide a selective advantage for cancer cell to overcome the many sources of stress faced during neoplastic evolution. If that holds true, the core vtRNA TFs may be co-opted in the malignant context to tune vtRNA expression favoring tumor progression. In agreement with that hypothesis, ATF1 and ATF3 were associated with CREB response and increased cell viability (Persengiev & Green, 2003). Likewise, Golec et al. showed that altered levels of vtRNA2-1 modulate PKR activation and consequently altered the levels of CREB phosphorylation during T cell activation, which is a prerequisite for IFN-g activity (Golec et al., 2019). Furthermore, E2F1, a cell cycle regulator, was described as a direct transcriptional regulator of vtRNA2-1 in cervical cancer cells (Li et al., 2017). Moreover, it was shown that MYC binding to vtRNA2-1 promoter raises its expression and could be the explanation for the increased levels of vtRNA2-1 in some tumors (Park et al., 2017a). Similarly, TGFB1 provokes the demethylation of vtRNA2-1 promoter and consequently increases its expression in ovarian cancer (Ahn et al., 2018).
Alternatively, since RNA polymerase III transcripts are involved in virus replication and immune response (Graczyk et al., 2015), it is possible that the TF bound to vtRNA vicinity are part of a global and not a vtRNA specific viral response.
Due to its importance for cancer biology, vtRNA2-1 transcriptional regulation was recently more investigated (Yeganeh & Hernandez, 2020). Likewise, to our knowledge, the global landscape of vtRNA expression across different cancer types has not been investigated. Our study shows that, except for vtRNA1-1, which has a high promoter chromatin accessibility and low DNA methylation in all the tissues, the other vtRNA promoters exhibit different levels of chromatin accessibility and DNA methylation among the tissue types. The relative status of the chromatin in different tissues is accurately mirrored by both approaches in the tissues examined by ATAC-seq and Illumina DNA methylation arrays. Interestingly, the same analysis performed in the normal tissue samples reveals that the tissue specific variation in DNA methylation of vtRNA promoters in the neoplastic tissues is already established in their normal tissue counterparts, indicating that the regulation of vtRNA expression is tuned during normal development and cell differentiation.
While there is no apparent concerted modulation of the chromatin accessibility of the vtRNA promoters in most tissues, ACC and LGG tumors seem to be the exception. Concordantly, a global hypomethylation of malignant ACC tumors (Legendre et al., 2016;Rechache et al., 2012), and an aberrantly methylation processes associated to altered DNMTs activity in LGG tumors were reported (Nomura et al., 2019). However, more research is necessary to understand the molecular basis of these observations.
The differential expression of the vtRNAs from normal to primary tumor tissues revealed different patterns depending on tissue origin. The average DNA methylation of the vtRNA1-1 and vtRNA1-3 promoters is not significantly deregulated, whereas vtRNA1-2 and vtRNA2-1 are decreased and increased respectively in tumor tissues. The latter finding favors a candidate OG and TSG function of vtRNA1-2 and vtRNA2-1 in cancer, respectively. Indeed, the epigenetic repression of vtRNA2-1 was previously reported by Romanelli et al. in BLCA, BRCA, COAD and LUSC analyzing the same data from TCGA (Romanelli et al., 2014). Furthermore, functional studies in LUSC, PRAD, ESCA and AML provided experimental support to that hypothesis (Cao et al., 2013;Fort et al., 2018;Lee et al., 2014a;Treppendahl et al., 2012). Yet, vtRNA2-1 has been also proposed as an OG in ovarian, thyroid, endometrial, cervical and renal cancer (Ahn et al., 2018;Hu et al., 2017;Lee et al., 2016;Lei et al., 2017;Li et al., 2017;Yeganeh & Hernandez, 2020). In agreement with the functional data found by Lei et al. (Lei et al., 2017), we found an epigenetic de-repression of vtRNA2-1 promoter DNA methylation in renal carcinoma (KIRP). Likewise, a closer look at THCA samples, shows an average decrease in the DNA methylation of vtRNA2-1 promoter that supports the OG function described in this tissue (Lee et al., 2016). Unfortunately, normal ovarian, cervical and endometrial tissues are not available at the TCGA. Finally, vtRNA1-2 has an oncogenic pattern of inferred expression between normal and tumor tissues for 13 of 16 tissues analyzed, raising a possible oncogenic function for this RNA.
Our study found also an association of DNA methylation at the promoter with a shorter overall survival for vtRNA1-1, vtRNA1-2 and vtRNA2-1. This is surprising for vtRNA2-1, since it has both TSG and OG roles and TSG compatible expression profiles in three cancer types, inferred by DNA methylation of its promoter in tumor vs normal tissues and the average promoter DNA methylation is increased in tumor compared to normal tissue. A higher impact of vtRNA2-1 expression in patient survival in the oncogenic context may justify this discrepancy. On the contrary, the association between vtRNA1-2 low promoter DNA methylation and low patient survival is in agreement with its OG expression in cancer. The lack of survival association with the ATAC-seq values may be due to the small number of patients studied by this approach.
VtRNA expression in individual cancer types, inferred from the epigenetic status of their promoters, has been associated with patient survival in several studies of vtRNA2-1, one of vtRNA1-3 and none for vtRNA1-1 or vtRNA1-2. Low methylation or high expression of vtRNA2-1 promoter were associated with good prognosis or overall survival in lung (Cao et al., 2013), esophageal (Lee et al., 2014a), prostate (Fort et al., 2018, AML (Treppendahl et al., 2012), gastric (Lee et al., 2014b and liver (Yu et al., 2020). Conversely, a worse prognosis or overall survival association of vtRNA2-1 was reported in thyroid (Lee et al., 2016) and ovarian (Ahn et al., 2018). From these reports, only prostate (Fort et al., 2018) and gastric (Lee et al., 2014b) studies used the TCGA data. Likewise, the methylation status of the vtRNA1-3 promoter associates with overall survival in the lower risk Myelodysplastic Syndrome patients (Helbo et al., 2015). Additionally, vtRNA1-1 and vtRNA1-2 correlate to chemotherapeutic resistance by direct interaction with drugs (doxorubicin, etoposide and mitoxantrone) and the modulation of vtRNA1-1 confirmed this finding in osteosarcoma cell lines (Gopinath et al., 2005;Gopinath et al., 2010;Mashima et al., 2008;van Zon et al., 2001). Furthermore, increased levels of vtRNA1-1 were associated with increased proliferation and chemoresistance due to GAGE6 induction in MCF-7 cells (Chen et al., 2018). Besides, Norbert Polacek group showed that vtRNA1-1 expression confers apoptosis resistance in several human cell lines (BL2, BL41, HS578T, HEK293, A549 and HeLa) and revealed it capacity to repressed intrinsic and extrinsic apoptosis pathway (Amort et al., 2015;Bracher et al., 2020). The later findings are in agreement with the lower patient survival associated with high levels of vtRNA1-1 expression cancer-wide observed in our analysis.
Seeking to get insights into the cancer related function of the vtRNAs we performed a genome wide search for genes coregulated at the level of promoter chromatin accessibility. Remarkably, we identified immune and cytokine related terms for the genes co-regulated with vtRNA2-1. This association agrees with its proposed roles in innate immune modulation via PKR repression or OAS1 regulation and in cytokine production (Calderon & Conn, 2018;Golec et al., 2019;Lee et al., 2020;Li et al., 2015). Furthermore, vtRNA2-1 has been associated with autoimmune disorders (Renauer et al., 2015;Weeding & Sawalha, 2018) and tumor engraftment in prostate cancer (Ma et al., 2020). This enrichment in viral infection pathways, together with the viral infection involvement of the core TFs common to all vtRNAs, reinforces the hypothesis of the re-utilization of a regulatory RNA induced upon viral response in favor of cancer development at upstream and downstream regulatory steps. Additionally, we found a non-random clustering localization at chromosome arm 5q for genes co-regulated with vtRNA1-1. The chromosome arm 5q and regions chr5q31 and chr5q32-q33 were found frequently deleted in myelodysplastic syndrome (MDS) and acute myeloid leukemia (AML) (Fuchs, 2012;Treppendahl et al., 2012). Furthermore, the modulation of vtRNA2-1 and vtRNA1-3 was previously associated with human bone marrow CD34+ cells, AML and MDS (Helbo et al., 2015;Treppendahl et al., 2012). Yet, our analysis points to vtRNA2-1 and vtRNA1-2, and to a lesser extent to vtRNA1-1 and vtRNA1-3, as the most relevant drivers across cancer types.
Support for the involvement of the vtRNAs in the immune responses in the context of cancer came from the association of their chromatin status with the immune subtype categories defined by Thorsson et al. (Thorsson et al., 2018). Our findings suggest a possible role of the vtRNAs in lymphocyte response in tumor samples, since the immunologically quiet subtype shows the lowest leukocyte fraction and vtRNAs promoter chromatin accessibility (Thorsson et al., 2018). Meanwhile, the vtRNA2-1 and vtRNA1-2 expression distinguishes two groups among the remaining four subtypes. The participation of vtRNA2-1 in the induction of the IFN-g and IL-2 expression in activated T cells through PKR modulation (Golec et al., 2019) goes in agreement with this hypothesis. Although the same study demonstrated that vtRNA1-1 expression was unchanged during T cell activation, it is remarkable that the 5q31-q33 region, comprising the vtRNA1 locus and chromatin co-regulated genes, encodes several cytokines that regulate the differentiation of Th1 and Th2 lymphocytes and has been linked with susceptibility to infections (Jeronimo et al., 2007;Lacy et al., 2000;Naka et al., 2009;Rodrigues et al., 1999). Furthermore, vtRNA2-1 levels were recently associated with macrophages M1/M2 fates in prostate PC3 cell line mice xenografts via TGF-b (Ma et al., 2020), two pathways that define the six immune subtypes (Thorsson et al., 2018). Finally, we found that the proliferation rate and wound healing of the tumors are associated with the levels of vtRNA1-2 chromatin accessibility, which reinforces an OG role of vtRNA1-2 in cancer. Nonetheless, the participation of the vtRNAs in immune cells response inside the neoplastic niche needs to be further investigated.

Conclusion
Taken together our analysis reveals the pattern of chromatin accessibility and DNA methylation at the four vtRNA promoters, analyzed by tissue of origin in the TCGA cohort. The agreement between both dataset and previous related literature endorses the use of these variables as surrogates of the vtRNA transcripts expression. VtRNA genes are enriched in viral related TFs. The comparison of TFs and chromatin status of the vtRNAs suggest that vtRNA2-1 is likely the most independently regulated vtRNA locus. The pattern of chromatin accessibility and DNA promoter methylation in normal vs tumor tissue and the association with cancer cell pathways and patient survival suggest that vtRNA2-1 and vtRNA1-2 are possibly the more relevant contributors to cancer. The results favor tissue specific TSG and OG roles for vtRNA2-1, a still not investigated oncogenic role of vtRNA1-2 and a limited cancer driver effect of vtRNA1-1/3 in specific tissue types. Lastly, we uncovered new evidence linking the vtRNAs with the immune response, cell proliferation and overall survival in cancer, which guarantees further investigation.

Data availability
Underlying data All raw data used in this study can be downloaded from Xena Browser, a tool to explore functional genomic data sets • Figure S2. Correlation between ATAC-seq values and promoter methylation values of vtRNAs in Pan-Cancer TCGA dataset. A-D. The correlation between ATAC-seq and DNA methylation average normalized beta values was calculated for vtRNAs (vtRNAs1-1 (A), vtRNA1-2 (B), vtRNA1-3 (C) and vtRNA2-1 (D)), in 329 primary tumors samples. The box plots show the median line and lower and upper quartile and the whiskers the 2.5 and 97.5 percentile. Horizontal grey striped and red dotted lines are shown to denote unmethylated (average beta-value ≤ 0.2), 50% methylated (average beta-value = 0.5) and highly methylated (average beta-value ≥ 0.8) levels. Spearman correlation value and the best-fit line (red line) with 95% confidence bands (black dot lines) are shown.
• Figure S3. This project contains the following extended data: • Tables S1-S5. ATAC-seq and DNA methylation data for vtRNA promoters in primary tumors and normal adjacent tissues. CSV spreadsheets: Table_S1_ATAC-seq_data_500bp: All ATAC-seq data of vtRNAs promoter (500bp) data for primary tumor samples; Table_S2_DNA_methylation_500bp: All DNA methylation data and ATAC-seq data of vtRNAs promoter (500bp) data for total primary tumor and normal adjacent samples; Table_S3_DNA_ methylation_NORMAL: All DNA methylation data of vtRNAs promoter (500bp) data for normal adjacent samples; Table_S4_DNA_methylation_TUMOR: All DNA methylation data of vtRNAs promoter (500bp) data for primary tumor samples; Table_S5_Normal_ &_Tumor_matched: All DNA methylation data of vtRNAs promoter (500bp) data for primary tumor and normal adjacent samples.
• Table S6. VtRNAs Transcription Factor Binding and KEEG enriched terms. CSV spreadsheets: Table_S6_Binding_Factors: Transcription factors identified in the cell line K562 as ChIP-seq Peaks by ENCODE 3 project and KEEG_terms: enriched KEGG pathway terms (FDR < 0.05).
• Tables S7-S8. DNA methylation, ATAC-seq data and associated survival data for primary tumors.

CSV spreadsheets:
Table_S7_DNA-methylation_ Survival_data: All DNA methylation data of vtRNAs promoter (500bp) and survival data for primary tumor samples; Table_S8_ATAC-seq_Survival_data: ATACseq data of vtRNAs promoter (500bp) and survival data for primary tumor samples. • Tables S11-S14. ATAC-seq and DNA methylation data for vtRNA promoters in primary tumors and the associated Immune Subtypes data. CSV spreadsheets: Table_S11_Immune_Subtypes_DNA_methylation_ data: All DNA methylation data of vtRNAs promoter (500bp) and Immune Subtypes data for primary tumor samples; Table_S12_Spearman_corr_vtRNAs_Immune_ Subtypes_DNA_methylation_data: Spearman correlation values of DNA methylation data of vtRNAs promoter (500bp) and Immune Subtypes data for primary tumor samples; Table_S13_Immune_Subtypes_ATAC-seq_ data: All ATAC-seq data of vtRNAs promoter (500bp) and Immune Subtypes data for primary tumor samples; Table_S14_Spearman_corr_vtRNAs_Immune_Sub-types_ATAC-seq_data: Spearman correlation values of ATAC-seq data of vtRNAs promoter (500bp) and Immune Subtypes data for primary tumor samples.
Data are available under the terms of the Creative Commons Zero "No rights reserved" data waiver (CC0 1.0 Public domain dedication).
Fort RS, Duhagon MA: Pan-Cancer chromatin analysis of the human vtRNA recently, a growingly number of studies has begun to report interesting roles of vtRNAs, most of which are independent of the vault complex. These days, next generation sequencing techniques have provided a huge amount of information on gene expression profiles in diverse biological contexts. However, as the authors indicated, vtRNAs have been out of this benefit, because of their sizes and, allegedly by the authors, their secondary structures. So, the authors turned into ATAC-seq data and CpG methylation data to infer their expression levels and analyzed them in several different aspects.
The best of all, the value of this manuscript lies in the fact that it has provided comprehensive information about vtRNAs. This manuscript is useful, albeit not a breakthrough paper, for vtRNA researchers. Overall, this manuscript is well qualified for indexing. However, incorporation of my points below would improve this manuscript, I hope.

Major points
The manuscript is too lengthy and I recommend succinct description. Especially, the Discussion part is mostly repetition of what has been already stated in the Results section. 1.
Page 4, "VtRNA2-1 has also been shown to act as a microRNA precursor in different tissues, serving both as TSG…": this description is followed by citation of many references. However, most of them simply deemed vtRNA2-1 to be miR-886-5p or -3p to go for subsequent experiments; for example, miRNA mimic transfection, target prediction, etc. All those experiments and analyses were nonsense, without prudent validation whether vtRNA2-1 generates miRNAs. The authors should mention this point and eliminate all references, except those that have provided experimental evidence for the presence of miRNA miR-886-5p or -3p.

2.
Page 15, "there is evidence of vtRNA transcript regulation via RNA methylation by NSUN2, or via its association to other proteins like DUSP11, DIS3L2, SSB, SRSF2 and DICER": regarding vtRNA2-1 regulation by Dicer, this paper "nc886 is induced by TGF-β and suppresses the microRNA pathway in ovarian cancer" 1 should be included.

3.
Page 16, "…Alternatively, the methods used to quantify the vtRNAs are not comparable among the studies": I believe that the levels of the four vtRNAs could not be compared among one another even within one study, unless they included a standard to normalize variation in efficiency between Northern probes or between qRT-PCR primers. This caveat must be mentioned. In addition, the authors should check each of the references (whether they did measurement with a standard) and cite selectively.

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes Are the conclusions drawn adequately supported by the results? Yes © 2021 Hasler D. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Daniele Hasler
Biochemistry Center Regensburg (BZR), Laboratory for RNA Biology, University of Regensburg, Regensburg, Germany The study by Fort and Duhagon estimates the expression of a class of non-coding (nc)RNAs termed 'vault RNAs' (vtRNAs) across different cancer types. Since vtRNAs might be underrepresented applying conventional RNA sequencing techniques, the authors infer vtRNAs expression by analyzing chromatin characteristics at promoter regions of vtRNA genes. In more detail, Fort and Duhagon use genome-wide chromatin accessibility data (ATAC-seq) as well as DNA methylation data sets from The Cancer Genome Atlas (TCGA) consortium to provide a comprehensive comparison between the four annotated vtRNA genes. By this approach, the authors determine the correlation between chromatin accessibility and DNA methylation of vtRNA promoters and compare their chromatin characteristics across different cancer types. Furthermore, Fort and Duhagon use publically available CHIP-seq datasets originating from a leukemia cell line to determine transcription factors (TFs) binding to vtRNA genes and identify a set of 23 core TFs common to all vtRNAs which might govern chromatin accessibility at these genomic sites.
Further analyses of the TCGA DNA methylation data sets provide evidence for vtRNA-specific deregulation upon malignant transformation in specific cancer types, as well as lower patient survival probability associated with reduced vtRNA promoter methylation. Finally, the authors determine genes that are co-regulated with vtRNAs in terms of chromatin accessibility and the also profile chromatin accessibility and DNA methylation of vtRNA promoters in six immune subtypes of the pan-cancer TCGA datasets.
VtRNAs are a class of ncRNAs with rather enigmatic functions, albeit they have been recently implicated in important cellular processes 1,2 and in cancer biology [ref-]. The work by Fort and Duhagon is the first pan-cancer study focusing on vtRNAs and it provides a nice and comprehensive overview for the potential function of the single vtRNA transcripts as tumor suppressors or oncogenes in specific cancer types.
The manuscript is well written, although some aspects could have been presented a bit more concisely. In particular, the authors dedicate special emphasis to the relative comparison between vtRNA genes (e.g. Figures 3A and 3B and result section related to Figures 4 and 8). The question arises whether a key message of the study is that specific vtRNA genes are up-or down-regulated in certain cancer conditions or whether a key point is that the relative expression levels of the four vtRNA genes are altered. If so, the authors should propose a model in the discussion section.

Additional major points:
The major limitation of this study is that vtRNA expression was not analyzed directly but was inferred from chromatin accessibility data (ATAC-seq) as well as DNA methylation data. Nevertheless, the authors honestly point out this limitation and provide convincing evidences from literature that support the validity of their study. Taken together, this approach sounds reasonable. Nevertheless, the findings could be further strengthened by analyzing TCGA miRNA-seq datasets for the expression of vtRNA-derived small RNA fragments. This strategy has been applied in previous studies to infer the expression of snoRNAs 4 and tRNAs 5 across different cancer types, assuming that small processing products correlate with expression levels of the full-length transcripts. Indeed snoRNAs and in particular tRNAs are ncRNA with similar sequencing biases as vtRNAs.
Furthermore, since a small RNA derived from vtRNA-2-1 has been linked to different cancers (see references within the manuscript), this additional pan-cancer analysis would be informative not only concerning this specific fragment but might also uncover the relevance of additional vtRNAderived small RNAs.
The analysis of TF occupancy at vtRNA loci is, in my opinion, rather speculative and vague. Since vtRNAs are transcribed by RNA polymerase III (pol III), is there any clear and direct evidence from literature that the identified TFs regulate Pol III activity (except for the TFIIIA, TFIIIB and TFIIIC complexes)? Could the binding of these TFs rather regulate transcription of adjacent RNA polymerase II genes and not of vtRNAs? The authors might at least briefly touch these aspects in the discussion.
It would be also worth to determine whether deregulation (e.g. of any of the 9 vtRNA1-2-specific TFs or any of the 4 vtRNA2-1-specific TFs) correlates with the changes of vtRNA promoter DNA methylation in normal vs tumor samples ( Figure 6). This would be more informative than the current conclusions concerning shared TFs between vtRNA loci and their co-regulated expression. Indeed, this analysis might help to understand, e.g., how the expression of a single vtRNA locus is regulated within the vtRNA1 cluster.
Interestingly, the chromatin profile of vtRNAs in brain lower grade glioma (LGG) indicates that vtRNA expression is strongly reduced in this condition. Does this effect hold true also for other classes of pol III transcripts (e.g. Y RNAs)? If so, this finding might point towards a general reduction of pol III activity in LGG.

Minor points:
The authors should reformulate their statement in the introduction "VtRNA2-1 has also been shown to act as a microRNA precursor", as the vtRNA2-1-derived fragment was not convincingly proven to be a bona fide microRNA and the hsa-miR-886 entry has been retracted in the current miRBase release 22.1 6 .
Since lower patient survival of cancer patients was associated with lower vtRNA promoter methylation and not with higher promoter accessibility, the paragraph "High chromatin accessibility at the promoter of vtRNA1-1, vtRNA1-2 and vtRNA2-1 is associated with low patient overall survival" should be renamed. In light of the statement "DNA methylation is a less direct and less accurate measurement of chromatin accessibility compared to ATAC-seq" (page 16), it is questionable whether the relationship between DNA methylation and patient survival is due to increased vtRNA expression, as a similar result would be expected also from the analysis of the ATAC-seq data. Is this simply due to a larger samples size of the DNA methylation dataset?
The authors do not mention that a pseudogene of vtRNA2-1 (vtRNA2-2P) has been annotated at chr2:65,555,432-65,555,534. Is there any reason for that? VtRNA2-2P could be included as proof of concept similarly to vtRNA3-1P on page 8. Page 3: "…the presence of silencing mutations at the B box element of its promoter B-box". Since the organization of RNA polymerase III promoters has not been explained before, it would be better to keep this statement more general without mentioning the B-box.
Partly question arises whether a key message of the study is that specific vtRNA genes are up-or downregulated in certain cancer conditions or whether a key point is that the relative expression levels of the four vtRNA genes are altered. If so, the authors should propose a model in the discussion section.
Authors response: We are grateful for the evaluation of our manuscript entitled "Pan-Cancer chromatin analysis of the human vtRNA genes uncovers their association with cancer biology", since we find these constructive comments rather pertinent and valuable.
We tried to answer all the points raised by the Reviewer and hope it led us to the improved of the manuscript.
It is true that the article is long. We believe the length is justified by the type of information it provides. In particular, the description of the patterns of vtRNA inferred expression involves four genes in 20 cancer types assessed by two methods, plus normal tissues data for DNA methylation.
We think that the message of the study is fairly written in the abstract and conclusion, neither of which prioritize the inferred relative expression of the vtRNAs per se, but what in the inferred expression pattern is suggestive of the involvement of the four genes in cancer. The article aimed to use the chromatin status of the genes to withdraw their candidate roles in cancer tissue. In our view the figures mentioned by the Reviewer are about their pattern of expression in cancer and immune subtypes, regulation, and the derived functional implications of those findings. Nevertheless, the narrative requires a description of their relative abundance and change, while their comparison helps to visualize how different they are expressed and the implication for their function in normal and tumor tissue.

Additional major points:
Reviewers' comment: The major limitation of this study is that vtRNA expression was not analyzed directly but was inferred from chromatin accessibility data (ATAC-seq) as well as DNA methylation data. Nevertheless, the authors honestly point out this limitation and provide convincing evidences from literature that support the validity of their study. Taken together, this approach sounds reasonable. Nevertheless, the findings could be further strengthened by analyzing TCGA miRNA-seq datasets for the expression of vtRNA-derived small RNA fragments. This strategy has been applied in previous studies to infer the expression of snoRNAs 4 and tRNAs 5 across different cancer types, assuming that small processing products correlate with expression levels of the full-length transcripts. Indeed snoRNAs and in particular tRNAs are ncRNA with similar sequencing biases as vtRNAs.
Authors response: We agree the small RNA sequencing analysis proposed by the Reviewer would be valuable to provide further support for the expression of the vtRNAs inferred from the chromatin status. Indeed, we have done it previously while studying snc886-3p, a microRNA-like RNA derived from vtRNA2-1 (published in Fort et al., 2020). There we showed that nc886 and snc886-3p expression (determined by qRT-PCR) negatively correlate with DNA methylation in six PrCa cell lines analyzed (r Spearman -0.71).
Nevertheless, since vtRNAs derived fragments are not quantified with the TCGA small RNA workflow, we need to request access to protected controlled-access data -primary sequencing data (BAM or FASTQ files) to identify them. We understand that this approach and the subsequent data analyses exceed the extent of the current manuscript.
Attempting to follow the Reviewer's recommendation using readily available data, we analyzed the NCI-60 cell lines panel, whose genome DNA methylation and small RNA-seq data is published. Indeed, the 4 vtRNA are known to be subject to posttranscriptional processing at some extent (Hussain et al., 2013;Sajini et al., 2019). We found that vtRNA2-1 produces more specific small RNAs than the other three vtRNAs, thus arguing in favor of a specialization of vtRNA2-1 as a Dicer substrate, which may explain the better correlation between the precursor and the small RNAs derived from it (see Panel A of the figure attached to this response; Figure_reviewer_response-I. Zenodo. http://doi.org/10.5281/zenodo.4784690). In addition, vtRNA2-1 generates more than 30 times small RNAs that the other vtRNAs (see Panel B of the figure attached to this response). As anticipated, the expression of the small RNA fragments derived from vtRNA2-1 shows a better correlation with promoter methylation in the NCI-60 cell lines (r= -0.77). Meanwhile, the small RNAs derived from vtRNAs of cluster 1 are less abundant and less associated to promoter methylation (r= -0.19--0.24) (see Panel C of the figure attached to this response).
In addition, the average abundance of the small RNAs derived from the vtRNAs of cluster 1 correlates with the relative methylation of their promoters (small RNAs derived from vtRNA1-1> vtRNA1-2> vtRNA1-3 and promoter methylation vtRNA1-1< vtRNA1-2< vtRNA1-3) (see Panel B of the figure attached to this response), in agreement with the TCGA data presented in our manuscript. Since small RNAs derived from vtRNA2-1 are more abundant than those derived from the other three vtRNAs, the relative expression of its fragments does not correlate with the relative DNA methylation of the others. In summary, the analysis of NCI-60 cell lines, a panel comprising all tissue types, confirms that the abundance of the small RNAs derived from vtRNAs negatively correlates with their promoter methylation but indicated it may be useful just for vtRNA2-1. The latter, together with the low abundance of the small RNAs derived from the vtRNAs and their processing (Hussain et al., 2013;Sajini et al., 2019), suggest that the abundance of small RNAs in the TCGA small RNA-seq data may not be a good proxy for vtRNA expression.
As discussed above, the post-transcriptional regulation of vtRNA seems to represent an additional layer of complexity of vtRNA biology that includes the processing into small RNAs. We speculate that only some of the fragments derived from specific vtRNAs may have functional significance, but it is likely reached by mechanisms different from those in which their precursors vtRNAs participate. We believe small RNAs derived from vtRNA are an interesting subject of investigation, requiring an independent study, which incorporates the biological mechanism in which they play a part.

Reviewers' comment:
Finally, in order to be more accurate about the limitation of the study mentioned by the Reviewer, we substituted the term "expression" for "chromatin status" or related terms when was appropriate.
The analysis of TF occupancy at vtRNA loci is, in my opinion, rather speculative and vague. Since vtRNAs are transcribed by RNA polymerase III (pol III), is there any clear and direct evidence from literature that the identified TFs regulate Pol III activity (except for the TFIIIA, TFIIIB and TFIIIC complexes)? Could the binding of these TFs rather regulate transcription of adjacent RNA polymerase II genes and not of vtRNAs? The authors might at least briefly touch these aspects in the discussion.
Authors response: We welcome the Reviewers comment since we had a similar concern about the significance of these observations. Yet, we reasoned that the identification of viral related transcription factors (TFs) near vtRNA genes was a provocative finding in the context of their involvement in viral infection and innate immune response. In addition, RNA Pol III (Pol III) has been linked to the innate immune response at different levels, thus it is tempting to speculate that the Pol III regulated genes are coordinately regulated by viral related TFs dependently or independently of RNA Pol II (Pol II) action.
The identified TFs that are shared by the vtRNA (23 TFs) comprise TFs known to be part of the Pol III basal machinery (3/23), or to be direct (6/23) or indirect interactors (3/23) of Pol III machinery. This observation endorses their participation in the vtRNA genes transcription.
There are no Pol II genes near or overlapping the vtRNA genes, thus this in an unlikely explanation for TF occupancy in the vicinity of the genes. Indeed, several lines of evidence have shown that Pol II and some of its specific TFs affect Pol III activity (White, 2011). The localization of the Pol II and its specific TFs (FOS, JUN, ETS1) near Pol III promoters (200bp upstream of TSS) in human cell lines, the global regulation by Pol II specific TFs (p53, MAF1, MYC, RB), the regulation of Pol III by Pol II activity, the sharing chromatin marks and the basal TFIIIB component TBP. The direct interaction between specific TFs and TFIIIB is also known to modulate Pol III transcription.
It has been proposed that Pol II may assist RNA pol III activity by either binding, transcription, or stalling, contributing to make the chromatin more permissive for transcription by Pol III. Curiously, Raha et al demonstrated a reduction of the three vtRNAs of locus 1 after α-amanitin treatment and the co-localization of Pol III and Pol II near these genes (Raha et al., 2010). Additional evidence of the Pol II regulation of specific Pol III transcribed genes has been published, including tRNAs (mediated by MAF1 (Gerber et al., 2020), GAF1 (Rodríguez-López et al., 2020), U6 RNA (OCT1, SP1, STAF, reviewed in (Hirsch et al., 2000)) Alu elements (mediated by YY1, SP1, MEG2, FOS, EP300, AP-1, (Zhang et al., 2019 and references therein). Finally, the crosstalk between the two polymerases seems to be tissue and gene specific, suggesting that Pol II-Pol III crosstalk may be finer that previously envisioned.
To amend the lack of discussion of this aspect we included the following sentence in the Discussion of the manuscript: "It is worth to mention that RNA polymerase II TFs clustered near TSS of RNA polymerase III transcripts could modulate occupancy and transcription rate of RNA polymerase III through RNA pol II activity (Campbell and White, 2014;Gerber et al., 2020;Raha et al., 2010;Zhang et al., 2019). Specifically, RNA polymerase II could assist chromatin opening, thus allowing accessibility of RNA polymerase III or its associated factors in the human genome (Raha et al., 2010)." vtRNAs. Interestingly, RNY3 and RNY4 show a conspicuously higher chromatin accessibility in LGG compared with other tissues, which is opposite to the lower accessibility observed for the vtRNAs These observations raise a possible RNA pol III misregulation in LGG tumors, which must be gene specific to justify opposite outcomes for RNAY and vtRNAs. The modified Supplementary Figure 3 and modified Supplementary Table 1 are now included in the manuscript.

Reviewers' comment:
The authors should reformulate their statement in the introduction "VtRNA2-1 has also been shown to act as a microRNA precursor", as the vtRNA2-1-derived fragment was not convincingly proven to be a bona fide microRNA and the hsa-miR-886 entry has been retracted in the current miRBase release 22.1 6 .
Authors response: Following the advice, we modified the sentences related to small RNAs derived from vtRNA2-1. "VtRNA2-1 has also been shown to generate small RNAs that could act like microRNAs in different tissues,…".

Authors response:
We are sorry for the mistake; we have corrected the name of the gene.

Reviewers' comment:
Since lower patient survival of cancer patients was associated with lower vtRNA promoter methylation and not with higher promoter accessibility, the paragraph "High chromatin accessibility at the promoter of vtRNA1-1, vtRNA1-2 and vtRNA2-1 is associated with low patient overall survival" should be renamed. In light of the statement "DNA methylation is a less direct and less accurate measurement of chromatin accessibility compared to ATAC-seq" (page 16), it is questionable whether the relationship between DNA methylation and patient survival is due to increased vtRNA expression, as a similar result would be expected also from the analysis of the ATAC-seq data. Is this simply due to a larger samples size of the DNA methylation dataset?
Authors response: We amended the title of the paragraph to "Low DNA methylation at the promoter of vtRNA1-1, vtRNA1-2 and vtRNA2-1 is associated with low patient overall survival." Despite no statistically significant differences in overall survival were observed using ATACseq patient data the curves were like those observed using the promoter DNA methylation. The lack of significance is likely due to the smaller ATAC-seq sample size as anticipated by the Reviewer. In addition, the association between OS and vtRNA promoter DNA