ALL Metrics
-
Views
-
Downloads
Get PDF
Get XML
Cite
Export
Track
Research Article

Preserved correlation matrices pinpoint extracellular matrix organization as a critical factor in pancreatic ductal adenocarcinoma

[version 1; peer review: 1 approved, 1 approved with reservations]
PUBLISHED 18 Apr 2023
Author details Author details
OPEN PEER REVIEW
REVIEWER STATUS

This article is included in the Oncology gateway.

This article is included in the Cell & Molecular Biology gateway.

Abstract

Background Gene co-expression correlations frequently signal shared biological functions with coordinated regulation. We hypothesized that maintained correlations might be essential for cellular survival, representing potential vulnerabilities of cancer cells. We aimed to reveal correlations preserved in pancreatic ductal adenocarcinomas (PDAC) across normal and tumor tissues.
Methods: We searched the NCBI GEO for raw microarray data and the TCGA project for RNA-seq data. The microarray dataset consisted of 248 tumors and 108 normal samples, allowing the analysis of 12,210 genes. The RNA-seq dataset incorporated 177 tumors, four normal samples from TCGA, and 248 normal samples from GTEx, enabling the analysis of 21,479 genes. Genes with an altered expression were identified with a Mann-Whitney U test at p<0.01, and a Pearson correlation was performed to identify preserved correlations.
Results: Altogether 371 significant correlations involving 262 genes were preserved across normal samples and tumors in both RNA-seq and gene chip platforms. The identified close-knit gene network is mainly responsible for extracellular matrix organization. Seven genes (SPARC, COL6A3, MMP2, HTRA1, FN1, PALLD, and COL3A1) were heavily overrepresented in maintained correlations, some of them participating in as many as 58 interactions. High expression of 28 genes was linked to poor disease outcome at FDR ≤ 10%,  out of which FN1, an extracellular matrix component, was both overrepresented in maintained correlations and associated with worse overall survival (p = 0.00097, FDR  5%). The growing expression of two genes, MYL12A and MYL12B, across normal tissues, primary, and metastatic tumors may drive the acquisition of motility by cancer cells.
Conclusions: Our results propose novel prognostic biomarkers of PDAC and pinpoint fundamental cellular interactions as potential targets for combination therapies. Furthermore, the presence of significant correlations across different data platforms substantiates the validity of our findings.

Keywords

PDAC, pancreatic cancer, FN1, SPARC, MYL12A

Introduction

With one of the lowest survival rates among any cancer, pancreatic cancer (PC) is a gastrointestinal malignancy with exceptional lethality. Despite its relatively low incidence rates, PC is the seventh leading cause of cancer-related death worldwide and the third in the USA after lung and colorectal cancer. The incidence rates are higher in more developed countries1 and display a rising tendency. By 2030, pancreatic cancer is predicted to be the second leading cause of cancer-related mortality in the USA.2 Approximately 90% of pancreatic cancer cases belong to pancreatic ductal adenocarcinomas (PDAC), arising from precursor lesions termed pancreatic intraepithelial neoplasia. The cause of the disease is multifactorial, but cigarette smoking appears to be one of the primary risk factors. However, alcohol consumption, obesity, aging, diabetes mellitus, and family history are significant contributors as well.35

Unfortunately, most patients seldom exhibit symptoms at the early stages of the disease. The wide range of non-specific symptoms, including jaundice, back- and abdominal pain, and weight loss, hampers precise diagnosis. In addition, existing screening biomarkers, including serum levels of CA19-9, display relatively low sensitivity and specificity.6 As a result, 80-85% of patients are diagnosed with unresectable tumors characterized by regional spread or metastatic disease. Only 24% of diagnosed patients survive their first year, while five-year survival is as low as 10%. Currently, surgery is the only available potentially curative option.7 Based on the extent of the disease, patients are allocated into one of four categories: resectable, borderline resectable, locally advanced, and metastatic. Altogether, only 15-20% of patients are good candidates for pancreatectomy, and the five-year survival after surgery remains low at 25%. Adjuvant therapy after surgery, including gemcitabine plus capecitabine, or modified FOLFIRINOX (oxaliplatin, irinotecan, leucovorin, and fluorouracil), is the preferred option, as both treatment regimens improve overall survival.8,9 The combination therapies based on modified FOLFIRINOX, or gemcitabine with nab-paclitaxel for metastatic pancreatic cancer, became the gold standard.1012 In the recent PRODIGE24-trial, a modified FOLFIRINOX regimen resulted in prolonged disease-free (21.6 vs. 12.8 months) and overall survival (54.4 vs. 35.0 months) compared with gemcitabine monotherapy. Nearly 20% of FOLFIRINOX treated patients were alive at 18 months compared to only 6% of those treated with gemcitabine alone.9 Locally advanced disease with vascular involvement is regularly treated with systemic chemotherapies approved in the metastatic setting. Despite standard chemotherapy, the vast majority of resected patients experience recurrence and die within a short period.

Despite extensive molecular characterization, personalized treatment strategies of PDCA are still limited. About 5-9% of pancreatic cancer patients carry germline BRCA1 or BRCA2 mutations. In 2019, the FDA approved olaparib, a PARP-inhibitor, as maintenance therapy for metastatic patients with BRCA1/BRCA2 mutations who had not progressed during first-line platinum-based chemotherapy. Olaparib therapy improved progression-free survival in this setting compared to placebo and became the first approved targeted therapy for pancreatic cancer.13 Other approaches, including the adoption of immune-checkpoint inhibitors, did not yet translate into clinical benefits, as pancreatic tumors are characterized by highly immuno-suppressive microenvironments.14 The diverse response to treatment sparked initiatives for patient classifications, but classical histopathological features lack clinical utility.15 Based on the further genome-wide screening of transcriptional changes, PDCA has been classified into two,16 three,17 or four18 distinct molecular subtypes. Cross-comparison of these studies revealed reproducible biologic subgroups with some prognostic relevance, but the described molecular subtypes have not yet been validated or translated into successful therapeutic decisions.19 Nevertheless, principal drivers of pancreatic tumorigenesis have been identified, including KRAS, SMAD4, CDKN2A, and TP53.1,20,21 Because so few patients can have a successful surgery, and the number of therapies is limited, identifying novel potential biomarkers for early diagnosis and targeted interventions are of utmost importance.

In this study, our aim was to exploit gene expression alterations between normal and tumor tissue samples. However, instead of searching for genes upregulated in tumors, we approached biomarker selection from a novel perspective. In normal, diploid tissue samples, gene co-expression correlations frequently signal shared biological functions with coordinated regulation. On the other hand, malignant tumors frequently undergo abnormal genetic transformations, but some correlations remain consistent across normal tissues and tumor samples. We speculated that maintained correlations might be vital for cellular proliferation and survival; thus, identifying preserved correlations in gene expression across normal and tumor tissue samples can provide new targets for developing alternate therapies.

Methods

Gene expression data from gene microarrays

We searched the NCBI GEO database (http://www.pubmed.com/geo) for raw microarray gene chip data focusing only on Affymetrix Human Genome U133A HGU133_2 array platforms. Out of the identified 19 datasets, studies based on cell lines and xenograft models were excluded. We have also removed experiments based on pooled samples. Duplicate samples were filtered by examining the expression profiles of the first 20 genes in each sample. In case of duplications, the first published version was retained. Samples with poor description and missing raw data were rejected.

The raw CEL files were MAS5 normalized using the Affy Bioconductor library in the R statistical environment (http://www.r-project.org).22 To reduce the batch effect, a second scaling normalization was performed to set the mean expression on each chip to 1000.23 The appropriate probe set for each gene was selected with the JetSet annotation and correction package.24 We integrated the established gene expression dataset into our previously established TNMplot.com platform,25 allowing future exploitation of the database.

Gene expression data from RNA-seq studies

We searched for RNA-seq-based studies from the TCGA project at the Genomic Data Commons (GDC) portal (https://portal.gdc.cancer.gov/) and the Genotype-Tissue Expression (GTEx) portal. Data from primary tumors and normal samples were obtained from the TCGA, while RNA-seq data from normal pancreatic tissue data were obtained from GTEx. Raw counts were normalized by the DESeq2 R package, after which a second scaling normalization was performed.26

Statistical analysis

Statistical analysis and data processing were performed within the R software package (version 4.0.4). Genes with median expression below the cutoff value of 500 were excluded from subsequent analyses to reduce the chance of noise discoveries. Expression differences for each gene between normal and tumor tissues were compared with the Mann-Whitney U test. Statistical significance was accepted at p < 0.01. We used the Benjamini-Hochberg method to correct for multiple comparisons by calculating the false discovery rate (FDR). Pearson correlation analysis was performed on genes common across the two datasets. A Kruskal-Wallis test was conducted to compare gene expression across normal, tumor, and metastatic tissue samples. The ggplot2 R package was used for data visualization.27 For selected genes, survival analysis was performed by Cox proportional hazards regression using all cutoff values between the lower and upper quartiles of expression. In addition, Kaplan-Meier plots were drawn to visualize survival differences. Only genes performing at FDR ≤ 10% were considered significant for selecting the most robust biomarker candidates.

Gene enrichment and network analysis

For the determination of functionally associated genes, we used the clusterProfiler28 R package and Enrichr, an open-source web-based gene enrichment analysis tool (https://maayanlab.cloud/Enrichr/), that integrates results from multiple libraries.29 Within Enrichr, the list of pathways related to our gene list was determined with the KEGG pathways option. Pathways with a Fisher exact test p-value < 0.05 and high combined score were selected. Biological processes associated with the selected list of genes were determined by GO Biological Process function. Only the top 20 significant terms were considered. The listed q-values are based on adjusted p-values corrected for multiple hypothesis testing using the Benjamini-Hochberg method. We used the CytoScape platform30 and the built-in stringApp31 for network analysis.

Results

The combined microarray (GEO) dataset consists of 248 pancreatic tumors and 108 normal tissue samples, allowing the analysis of 12,210 genes. The RNA-seq dataset incorporates 177 tumors and four normal samples from TCGA, and 248 normal tissue samples from GTEx, enabling the analysis of 21,479 genes.

Each dataset underwent multistep filtering: gene expression differences were assessed between tumor and normal tissue samples based on log2FC values. Genes with log2FC > 0.5 or log2FC < -0.5 from the GEO dataset, and genes with log2FC > 1 or log2FC < -1 from the RNA-seq dataset were incorporated into subsequent analyses. Altogether 6088 genes had altered expression at p < 0.01 across tumor and normal tissue samples in the GEO dataset, while 17,967 genes at p < 0.01 in the RNA-seq dataset. Genes with median expression below the cutoff value of 500 were excluded from subsequent analyses to reduce the chance of noise discoveries; therefore, the number of genes was reduced to n = 2,205 in the GEO dataset and n = 3,773 in the RNA-seq dataset.

We identified 309 common genes in both RNA-seq and microarray datasets with altered gene expression across normal and tumor tissues (Figure 1A). In general, more overexpressed genes were identified in tumors compared to normal tissues. Out of the 309 genes, 267 had increased expression in primary tumors compared to 42 genes overexpressed in normal samples, both in GEO and RNA-seq datasets. Genes with altered expression across normal and tumor tissues are graphically represented by a VOLCANO plot (Figure 1B).

374e4f7a-7037-4cce-a2bc-3e09f98705fc_figure1.gif

Figure 1. The process of data selection from the TCGA and GEO databases (A). Genes with altered expression across tumors and normal tissue samples, in RNA-seq (green) and gene chip (red) datasets (B). Representation of the gene ontology analysis, including the 262 genes participating in maintained correlations. The size of the dot indicates the number of genes associated with a particular biological process (C).

Preserved correlations across normal and PDAC tumor samples

A Pearson correlation was performed to identify associations between the 309 differentially expressed genes and all genes in the gene list. We analyzed correlations in tumors and normal tissue samples separately, then common correlations were identified. We have filtered out correlations with a correlation coefficient below 0.5.

Altogether, we have identified 371 preserved correlations involving 262 individual genes across tumors and normal tissue samples. These preserved correlations were robust, as were independently identified in both microarray and RNA-seq datasets with R > 0.5 or R < -0.5, and p < 0.01 (Supplemental Table 1). The network of significantly correlated genes is illustrated in Figure 1C. The most significant top 20 correlations in both normal tissues and tumor samples in each dataset are presented in Table 1.

Table 1. Top 20 preserved correlations in normal tissue samples (A) and tumor samples (B), in each dataset (RNAseq and GEO), separately.

Gene 1Gene 2Pearson's r RNAseqp-value RNAseqGene 1Gene 2Pearson's r GEOp-value GEO
A)
SPARCPOSTN0.981.23E-174COL1A2COL3A10.973.48E-66
COL1A2COL5A10.981.67E-172COL1A2COL6A30.925.45E-44
SPARCCOL1A10.985.43E-170COL6A3COL5A20.915.13E-42
COL1A2COL5A20.975.53E-162ANXA2S100A100.91.16E-40
SPARCTHBS20.972.01E-160MMP2LOXL10.91.66E-39
FN1COL1A10.977.29E-159COL3A1SPARC0.93.32E-39
FN1COL10A10.963.48E-146SPARCCOL5A20.95.49E-39
COL3A1SULF10.964.37E-141COL3A1COL6A30.897.00E-39
FN1NOX40.962.78E-140COL3A1CDH110.891.03E-38
COL1A2COL11A10.961.60E-137COL3A1COL5A20.891.25E-38
COL3A1CTSK0.959.39E-133SPARCC1S0.891.76E-38
SPARCCOL5A10.958.28E-132COL3A1LUM0.881.26E-36
FN1COL5A10.953.51E-128SPARCDACT10.883.16E-36
COL3A1CDH110.952.39E-125SPARCFBN10.885.65E-36
SP3YTHDF30.944.29E-122FN1COL5A10.886.99E-36
COL1A2COL3A10.941.78E-121COL1A2COL5A20.881.64E-35
SPARCHOPX0.941.10E-120COL6A3CDH110.882.24E-35
COL3A1COL5A20.942.52E-120SPARCFSTL10.882.29E-35
SPARCDPYSL30.943.08E-120SPARCCDH110.871.13E-34
COL1A2LUM0.943.24E-118HSPH1BAG30.871.76E-34
B)
TPM2CNN10.917.69E-69SPARCCDH110.877.85E-78
TPM2MAB21L20.92.36E-63FN1POSTN0.871.05E-76
SP3YTHDF30.882.99E-59SPARCGLT8D20.872.16E-76
TPM2MYL90.872.37E-54SPARCFBN10.867.72E-75
TPM2FLNA0.861.64E-53MT1XMT2A0.862.74E-72
TPM2MYH110.832.03E-46SPARCLUM0.851.13E-71
COL6A1COL6A20.838.13E-46COL3A1COL1A20.851.64E-71
STAU1ZPR10.832.75E-45COL6A3COL5A20.852.08E-71
HSPH1DNAJB10.827.99E-45SPARCASPN0.852.52E-71
TPM2MYLK0.821.16E-44SPARCACTA20.845.28E-67
RNPS1U2AF20.821.19E-43HTRA1COL5A10.841.60E-66
ACTR3GNPTAB0.811.91E-42SPARCCLIC40.831.73E-65
HSPH1HSPA1B0.791.68E-39HTRA1AEBP10.834.37E-65
PSMD2SNRPD30.794.08E-39REG1AREG1B0.834.07E-63
MMP2CTSK0.784.70E-37PALLDCALD10.821.95E-60
PSMD2HSPA40.784.71E-37SPARCCOL5A20.822.87E-60
TPM2TAGLN0.771.29E-36SPARCFAP0.813.78E-60
HTRA1AEBP10.772.50E-36S100A11S100A60.817.93E-60
PGRMC1EIF1B0.741.79E-32MMP2EMILIN10.813.28E-58
ALDOAGPI0.749.11E-32FN1COL5A20.819.45E-58

Altered strength of correlations across normal samples and tumors

We observed differences in the strength of preserved correlations between normal tissues and tumor samples. In normal samples, the strongest correlations involved mainly genes encoding collagens both in RNAseq and GEO data. Conversely, in tumors, we observed a shift in the strength of correlations. Genes associated with the regulation of focal adhesion and actin cytoskeleton (ACTR3, MYH11, MYL9, MYLK) along with members of the Heat Shock Protein Family (e.g., HSPA4, HSPH1, HSPA1B), gained momentum and participated in the strongest correlations compared to normal samples (Table 1).

Gene overrepresentations

Seventy-nine genes from the identified 309 differentially expressed genes participated in preserved correlations, out of which 73 genes were upregulated in tumors compared to normal tissue samples (Supplemental Table 2).

Some genes overexpressed in tumor samples were strikingly overrepresented among the 371 preserved correlations, including SPARC, COL6A3, MMP2, HTRA1, FN1, PALLD, and COL3A1, some participating in as many as 58 correlations (Figure 2A). Over 70% of all preserved correlations (263 altogether) involved at least one of these seven genes.

374e4f7a-7037-4cce-a2bc-3e09f98705fc_figure2.gif

Figure 2. Seven genes were heavily overrepresented in maintained correlations, among those SPARC participating in as many as 58 correlations (A). Increased expression of 28 genes from preserved correlations was associated with worse overall survival in pancreatic cancer patients, including ENO1, DSG2, CLIC1, ADAM10, ACTR3, FN1, MYL12A, and COL4A2 (B). A STRING protein network analysis including the 79 genes with differential expression from preserved correlations. FN1 has a central role in the associated network of proteins (C).

Survival analysis

Out of the 262 genes participating in 371 preserved correlations, high expression of 28 genes was associated with worse overall survival at FDR 10%. Out of these, 16 genes (DSG2, FN1, ACTR3, PKM, CAPZB, TSC22D1, COL4A2, ADAM10, ENO1, CLIC1, PARP12, PPP4R1, PSMC4, KRT18, MYL12A, and PSMD2) were upregulated in tumors compared to normal samples (Table 2, Figure 2B).

Table 2. 28 genes were associated with dismal disease outcomes out of the 262 genes participating in preserved correlations.

Out of these 16 genes were upregulated in tumors compared to normal tissue samples. Genes in the boldface were overexpressed in tumors.

GeneFalse discovery rate %p-value
ENO110.0000082
ITGB110.0002
DSG210.00021
CLIC110.00031
MMP1420.00048
ACTR320.00054
SULF130.00062
PSMD230.00076
PSMC430.00077
POSTN30.00089
FN150.00097
ADAM1050.0011
FEN150.0013
GNPTAB50.0014
MT1X50.0015
PPP4R150.0015
CAPZB50.0015
SRPX250.0015
PKM100.0019
PLAU100.0019
MYL12A100.0023
ZPR1100.0024
TSC22D1100.0026
PARP12100.0028
KRT18100.0029
COL4A2100.003
COL1A1100.0032
CSNK1D100.0032

Among the upregulated genes the expression of ENO1 (p = 8.2E-06), DSG2 (p = 0.00021), CLIC1 (p = 0.00031) and ACTR3 (p = 0.00054) showed a particularly strong association with disease outcome (FDR ≤ 2%) (Figure 2B). The heavily overrepresented FN1 gene was also associated with worse overall survival at FDR 5% (p = 0.00097), providing promising biomarker candidates for subsequent clinical validations. According to a String functional network analysis, the FN1 protein has a central role in the associated protein network of upregulated genes (Figure 2C).

Gene expression alterations during tumor progression

The 371 preserved correlations incorporated 79 of the initially identified differentially expressed genes, out of which the majority, 73 genes, were overexpressed in tumors compared to normal tissues. With the help of TNMplot.com, we have also compared the expression pattern of these genes between primary tumors and metastases. The expression of MYL12A increased most significantly in metastatic samples compared to primary tumors (Figure 3A). Of note, the expression of both MYL12A and MYL12B gradually intensified across normal samples, primary tumors, and metastatic tissue samples (MYL12A, KW p = 2.57E-11; MYL12B KW p = 1.38E-14). We have observed a similar pattern in seven additional genes, including ENO1 and CLIC1, that were previously identified as robust prognostic biomarkers of survival (Figure 3B).

374e4f7a-7037-4cce-a2bc-3e09f98705fc_figure3.gif

Figure 3. Comparing gene expression across tumors, normal and metastatic samples.

Expression differences are color-coded: red encodes increasing expression from normal to tumor and from tumors to metastatic samples, while green encodes decreasing expression (A). The expression of MYL12A, MYL12B, CLIC1 and ENO1 genes gradually increased from normal tissue samples to primary tumors and metastatic samples (B).

The close-knit network of 262 genes involved in preserved correlations is mainly associated with ECM organization and function (C). Gene ontology analysis involving the 79 differentially expressed genes from preserved correlations revealed similar top functions (D).

Gene ontology

We subjected the 262 genes from preserved correlations to gene enrichment analysis. The top biological functions were Extracellular matrix organization, Extracellular structure organization, and External encapsulating structure organization (Figure 3C). The most significantly enriched terms based on KEGG Pathway analysis were Focal adhesion, Protein digestion and absorption, ECM-Receptor interaction, Tight junction, Proteoglycans in cancer, regulation of actin cytoskeleton, and AGE-RAGE signaling pathway in diabetic complications (Table 3A). According to the GO Cellular compartment analysis in Enrichr, 54 genes were significantly associated with the collagen-containing ECM (Supplemental Table 3).

Table 3. KEGG-Pathway analysis of 262 genes participating in 371 correlations preserved both in normal tissues, and tumor samples A). and KEGG Pathway analysis of 73 genes from preserved correlations overexpressed in tumor samples B).

Top functions remained similar in both cases.

TermP-valueAdjusted P-valueCombined scoreGenes
A)
Focal adhesion2.46E-134.58E-11274.88PDGFRB; ITGB1; LAMA4; FN1; THBS2; ACTB; MYL12A; MYLK; MYL12B; COL1A1; COL1A2; COL4A2; COL4A1; COL6A2; PDGFC; COL6A1; FLNA; COL6A3; RAC1; PPP1R12B; MYL9
Protein digestion and absorption7.55E-117.02E-09290.48COL15A1; COL11A1; COL1A1; COL3A1; COL1A2; COL4A2; COL5A1; COL4A1; COL6A2; COL5A2; COL6A1; COL8A2; COL10A1; COL6A3
ECM-receptor interaction1.69E-091.05E-07250.85ITGB1; COL1A1; COL1A2; COL4A2; COL4A1; COL6A2; LAMA4; COL6A1; FN1; COL6A3; THBS2; HSPG2
Tight junction2.45E-061.14E-0477.35ITGB1; ACTR3; PCNA; HSPA4; MYH11; RAC1; MYL9; MPDZ; ACTB; MYL12A; MYL12B; TJP2
Proteoglycans in cancer3.28E-061.22E-0467.11ITGB1; LUM; MMP2; FN1; HSPG2; ACTB; DCN; COL1A1; COL1A2; PLAU; FLNA; RAC1; PPP1R12B
Regulation of actin cytoskeleton6.43E-061.77E-0459.47PDGFRB; ITGB1; ACTR3; FN1; ACTB; MYL12A; MYLK; MYL12B; PDGFC; MYH11; RAC1; PPP1R12B; MYL9
AGE-RAGE signaling pathway in diabetic complications6.67E-061.77E-0491.54COL1A1; COL3A1; COL1A2; COL4A2; COL4A1; MMP2; FN1; NOX4; RAC1
Leukocyte transendothelial migration1.94E-054.51E-0472.18ITGB1; MMP2; CTNNA1; THY1; RAC1; MYL9; ACTB; MYL12A; MYL12B
Amoebiasis5.91E-050.0012264.08COL1A1; COL3A1; COL1A2; COL4A2; IL1R1; COL4A1; LAMA4; FN1
Bacterial invasion of epithelial cells6.73E-050.0012274.10ITGB1; ACTR3; CTNNA1; FN1; RAC1; ACTB; CD2AP
PI3K-Akt signaling pathway7.21E-050.0012233.14PDGFRB; ITGB1; LAMA4; FN1; THBS2; YWHAZ; COL1A1; COL1A2; COL4A2; COL4A1; COL6A2; PDGFC; COL6A1; COL6A3; RAC1
Platelet activation2.32E-040.003491244.59COL1A1; ITGB1; COL3A1; COL1A2; ACTB; MYL12A; MYLK; MYL12B
Glycolysis/Gluconeogenesis2.44E-040.003491262.89GPI; LDHA; PKM; PGK1; ENO1; ALDOA
Human papillomavirus infection4.51E-040.005990124.56PDGFRB; ITGB1; LAMA4; FN1; THBS2; COL1A1; COL1A2; PKM; COL4A2; COL4A1; COL6A2; COL6A1; COL6A3
Small cell lung cancer0.00132870.016476535.47ITGB1; COL4A2; COL4A1; LAMA4; FN1; E2F3
B)
Focal adhesion3.01E-093.52E-07321.8192187COL1A2; COL4A2; COL4A1; COL6A1; FN1; COL6A3; RAC1; ACTB; MYL12A; MYL12B
ECM-receptor interaction3.23E-081.89E-06448.2541306COL1A2; COL4A2; COL4A1; COL6A1; FN1; COL6A3; HSPG2
AGE-RAGE signaling pathway in diabetic complications7.85E-083.06E-06370.0434977COL3A1; COL1A2; COL4A2; COL4A1; MMP2; FN1; RAC1
Bacterial invasion of epithelial cells3.70E-071.08E-05370.86943ACTR3; CTNNA1; FN1; RAC1; ACTB; CD2AP
Protein digestion and absorption2.07E-064.85E-05239.5764132COL3A1; COL1A2; COL4A2; COL4A1; COL6A1; COL6A3
Salmonella infection3.45E-066.26E-05126.4426771ACTR3; ANXA2; AHNAK; RAC1; ACTB; MYL12A; MYL12B; S100A10
Leukocyte transendothelial migration3.75E-066.26E-05205.3304748MMP2; CTNNA1; RAC1; ACTB; MYL12A; MYL12B
Glycolysis/Gluconeogenesis4.57E-066.68E-05289.707921LDHA; PKM; PGK1; ENO1; ALDOA
Tight junction3.55E-054.17E-04111.2552482ACTR3; RAC1; ACTB; MYL12A; MYL12B; TJP2
Amoebiasis3.57E-054.17E-04153.9483051COL3A1; COL1A2; COL4A2; COL4A1; FN1
PI3K-Akt signaling pathway4.39E-054.67E-0469.87964311COL1A2; COL4A2; COL4A1; COL6A1; FN1; COL6A3; RAC1; YWHAZ
HIF-1 signaling pathway4.90E-054.78E-04139.0708365LDHA; PGK1; ENO1; TIMP1; ALDOA
Platelet activation9.06E-058.15E-04113.9377743COL3A1; COL1A2; ACTB; MYL12A; MYL12B
Proteoglycans in cancer1.04E-048.52E-0481.45091853COL1A2; MMP2; FN1; RAC1; HSPG2; ACTB
Relaxin signaling pathway1.09E-048.52E-04107.1230664COL3A1; COL1A2; COL4A2; COL4A1; MMP2

The ITGB1, COL1A2, COL4A1, COL4A2, and FN1 genes were predicted to be involved in multiple cancer-related pathways, such as ECM-receptor interactions, the PI3K-Akt signaling pathway, human papillomavirus infection, and small cell lung cancer.

When we narrowed down the gene enrichment analysis to the 73 genes overexpressed in tumors participating in preserved correlations, the most significantly enriched biological functions and KEGG pathways remained similar (Table 3B, Figure 3D).

Discussion

Pancreatic cancer is one of the most lethal malignancies with limited options of therapies and biomarkers. In order to pinpoint survival-specific vulnerabilities, we aimed to identify upregulated genes with preserved correlations across normal and tumor samples. Previous research revealed profoundly different biomolecular networks underlying cancer conditions compared to healthy tissues.32 Notwithstanding, we speculated that sustained correlations might offer insight into fundamental biological functions, providing potential novel targets for therapeutic interventions. To increase robustness, we collected data from two independent platforms, including RNA-seq and microarray-based studies. Then, we identified genes with an altered expression between tumors and normal tissues and looked for correlations preserved across both data platforms.

We have identified 371 preserved correlations comprising 262 genes, including 79 genes with altered expression across tumors and normal samples. These preserved correlations were robust, as were independently identified in both microarray and RNA-seq datasets. Moreover, a few of these differentially expressed genes remarkably dominated preserved correlations, including SPARC, COL6A3, COL3A1, MMP2, HTRA1, FN1, and PALLD. Altogether, 16 genes upregulated in tumor tissues were linked to dismal disease outcomes, out of which the expression of MYL12A gradually increased during disease progression.

Upon close examination, most of these genes are associated with the tumor microenvironment, participating in extracellular matrix (ECM) organization. PDAC has a dense fibrotic stroma, composed of abundant extracellular matrix and other non-neoplastic cell types, responsible for chemo- and radiotherapeutic resistance.33 The abundant ECM deposition, called desmoplastic reaction, plays a crucial role in PDAC. Beyond structural support, the ECM functions as a reservoir of growth factors and a mediator of complex signaling pathways.34,35 The loss of ECM homeostasis and integrity is one of the hallmarks of cancer, resulting in cancer progression36; thus, potential targeting and modifications of the ECM in PDAC are being intensely studied both in the preclinical and clinical settings.37,38

Our results corroborate the importance of ECM organization in disease outcomes. Most importantly, we pinpoint critical interactions as potential targets for combination therapies. Fibronectin (FN1) is a principal constituent of the extracellular matrix that provides the scaffold to build up the entire ECM structure, affecting its architecture and composition. It is overexpressed in PDAC as part of the desmoplastic reaction, is associated with aggressive tumor characteristics,39 and functions as a primary driver of tumor progression by enhancing growth factor signaling.40 FN1 secreted by pancreatic stromal cells promoted gemcitabine resistance by downregulation of ERK1/2 phosphorylation.41 A synthetic FN1 blocking-peptide (Arg-Gly-Asp-Ser) abrogated the gemcitabine resistance in PDAC cell lines.41

Our datasets identified both FN1 and the matricellular protein SPARC remarkably overrepresented in preserved correlations across normal and tumor samples. High stromal expression of SPARC was linked to poor patient prognosis in PDAC,42 and pancreatic stellate–cell-derived SPARC promoted proliferation of PDAC cells in vitro.43 Recent research indicated that the presence of plasma fibronectin determines the specific effect of SPARC on pancreatic cancer cells: depletion of fibronectin switched SPARC from promoting cancer cell proliferation to growth inhibition and induction of apoptosis.43 In our data both FN1 and SPARC expression were correlated with MMP2 in normal and cancer tissue samples. MMP2 is a member of the matrix metalloproteinase (MMP) gene family, primarily located in the tumor stroma, capable of cleaving collagens and other components of the extracellular matrix involved in signal transduction.44 In PDAC, MMP2 is among the most studied gelatinases, with a potential clinical relevance45: MMP2 expression is linked to the activity and the invasive potential of pancreatic cancer cell lines, also associated with the development of the characteristic desmoplastic reaction.46 The potential importance of MMP2 in PDAC progression is supported by studies using selective MMP2, MMP9, and MMP14 inhibitors in mice and Syrian hamsters, such as MMI-166, RO28-2653, OPB-3206, and SB-3CT.45

The most abundant component of the mammalian ECM is collagen. Collagen Type IV Alpha 2 Chain (COL4A2) encodes one of the six subunits of collagen type IV, a major structural component of basal membranes.47 In our dataset, COL4A2 expression was highly correlated with COL4A1, another type IV collagen alpha protein found in most connective tissues. In pancreatic cell lines and xenograft models of pancreatic cancer, type IV collagen production exceeded that of other ECM proteins examined, including collagen type I.48 The abundantly produced collagen type IV is deposited near the pancreatic tumor cells, creating a basal membrane-like structure.49 The structure colocalizes with integrin receptors on the cancer cell surface, stimulating the proliferation and migration of pancreatic cancer cells. Thus, type IV collagens provide essential survival signals to the pancreatic cancer cells through an autocrine loop.49 High circulating level of postoperative collagen IV has been linked to dramatically reduced survival in PDAC patients undergoing curative surgery.50

Substantiating previous findings, we found an association between high COL4A2 expression and poor prognosis, supporting the role of COL4A2 as a potential prognostic biomarker of PDAC. The drug candidate T12, a rationally designed inhibitor of collagen IV network formation, reduced both COL4A1 and COL4A2 expression and disrupted EMT-based chemo-resistance in mouse models of breast and lung cancers without apparent toxicity.51 Thus, targeting specific EMC proteins, including COL4A2 and COL4A1, may represent a novel therapeutic opportunity for treating PDAC.

Another protease, the ADAM Metallopeptidase Domain 10 (ADAM10), has also been heavily investigated by pharmaceutical companies. ADAM10 is ubiquitously expressed and cleaves many substrates involved in cancer signaling, such as PD-L1, NOTCH, EGFR/HER-ligands, among others.52 The evidence for ADAM10 as a suitable cancer biomarker increases: ADAM10 overexpression correlates with poor patient prognosis in cancers of the colon, lung, ovary, uterus, and stomach.53 Moreover, it has a substantial value as a prognostic biomarker in glioblastomas, sacral chordomas, and triple-negative breast cancer and became a potential target.52,54,55 In PDAC, ADAM10 was suggested to be a driver of tumor progression and fibrosis after radiation therapy. Pharmacological inhibition of ADAM10 delayed fibrosis and improved survival in radiation therapy-treated orthotopic and metastatic mouse models.56 Targeting the active ADAM10 with a monoclonal antibody, such as mAb 8C7, that masks the substrate-binding pocket preferentially located in tumors compared to normal tissue may offer targeted inhibition to prevent tumor growth.57

Our results suggest additional insight into the evolving tumor microenvironment during disease progression. The expression of MYL12A was associated with poor prognosis and correlated with MYL12B upregulation within both normal and tumor samples. The expression of both genes increased during tumor progression, from normal tissues to primary tumors to metastatic samples. Myosin Light Chain 12A (MYL12A) and 12B (MYL12B) regulate smooth muscle and non-muscle cell contractile activity. Both genes are implicated in cytokinesis and cell locomotion,58 and intensifying expression levels indicate the acquisition of motility by cancer cells. The biological functions and rising expressions during cancer progression warrant the role of MYL12A both as a potential prognostic biomarker and a therapeutic target of PDAC.

We observed an increase in the expression of ENO1 and CLIC1 genes between normal samples and tumor tissue. Moreover, in our data, overexpression of both ENO1 and CLIC1 was associated with poor overall survival. Alpha enolase (ENO1) is a glycolytic enzyme localized on the cell surface with a complex role in cancer progression: ENO1 promotes glycolysis and ECM degradation, affects actin remodeling, facilitates tumor invasion, and migration, and activates cancer signaling pathways.59 During inflammation, enolase migrates to the cell surface that promotes the production of pro-inflammatory cytokines, chemokines, reactive oxygen species, and nitric oxide.60

Higher ENO1 expression has been associated with larger tumor size, increased invasion, shorter survival, and poor prognosis in several solid tumors, including hepatocellular carcinomas, breast- and lung cancers,59 and PDAC.61 ENO1 has clinicopathological and diagnostic significance as a tumor-associated antigen as well: combining ENO1 autoantibodies with other biomarkers improved diagnostic sensitivity and accuracy of lung cancer detection.62,63 In PDAC, several fundamental characteristics designate ENO1 to be an ideal biomarker candidate: it is localized on a cell surface, making it targetable; it is upregulated in tumors compared to normal tissues, is associated with clinical outcome, and has vital metabolic functions. Small molecule inhibitors (e.g., ENOblock, ENO1i) are already being investigated in various preclinical settings and disease models, including spinal cord injuries and multiple myelomas.64,65 Additionally, DNA vaccinations targeting ENO1 in combination with chemotherapy may offer promising alternatives in PDAC treatment. In pancreatic cancer-prone mice, treatment with gemcitabine combined with anti-ENO1 vaccine elicited a robust antitumor response and reduced the size of PDAC lesions.66 Administration of an anti-ENO1 mAb inhibited in vitro invasion of human PDAC cells and their metastatic spread in immunocompromised mice; thus, targeting ENO1 may be exploited as a novel therapy to increase the survival of metastatic PDAC patients.67

PDAC presents unique challenges for chemotherapy and immunotherapy with an immunosuppressive microenvironment and a dense stroma, creating a physical barrier to drug delivery; thus, alternative therapeutic targets, such as ion channels, provide an increasingly evaluated option. Ion channels are involved in every aspect of oncogenesis, influencing critical processes within each hallmark of cancer. Chloride Intracellular Channel 1 (CLIC1) plays a vital role in many physiological functions, including ion homeostasis, acidity, and cellular volume, considered to be both the sensor and executor of oxidative stress. It can exist in a soluble form, but in pathological states, CLIC1 is expressed explicitly as a transmembrane chloride channel.68 CLIC1 overexpression has been established in pancreatic cancer tissue and is indicated to be a putative oncogene.6971

In summary, our findings provide novel insight into intimate gene-networks corroborating previous knowledge about interactions within the tumor microenvironment. Although our results are based on in silico approaches, maintained correlations across RNA-seq and gene chip platforms substantiate the significance of our findings. Our results may be utilized to develop combination therapies against PDAC and offer novel ideas to be explored in future studies.

Comments on this article Comments (0)

Version 1
VERSION 1 PUBLISHED 18 Apr 2023
Comment
Author details Author details
Competing interests
Grant information
Copyright
Download
 
Export To
metrics
Views Downloads
F1000Research - -
PubMed Central
Data from PMC are received and updated monthly.
- -
Citations
CITE
how to cite this article
Menyhart O, Bartha Á and Győrffy B. Preserved correlation matrices pinpoint extracellular matrix organization as a critical factor in pancreatic ductal adenocarcinoma [version 1; peer review: 1 approved, 1 approved with reservations]. F1000Research 2023, 12:418 (https://doi.org/10.12688/f1000research.131414.1)
NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.
track
receive updates on this article
Track an article to receive email alerts on any updates to this article.

Open Peer Review

Current Reviewer Status: ?
Key to Reviewer Statuses VIEW
ApprovedThe paper is scientifically sound in its current form and only minor, if any, improvements are suggested
Approved with reservations A number of small changes, sometimes more significant revisions are required to address specific details and improve the papers academic merit.
Not approvedFundamental flaws in the paper seriously undermine the findings and conclusions
Version 1
VERSION 1
PUBLISHED 18 Apr 2023
Views
4
Cite
Reviewer Report 25 Jan 2025
Howard Crawford, Michigan State University, East Lansing, Michigan, USA 
Approved with Reservations
VIEWS 4
In this manuscript, the authors have mined publicly available microarray data to explore gene expression associations with patient outcome.

While the computational analysis reported is of very high quality, it is unclear what new has been learned ... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
Crawford H. Reviewer Report For: Preserved correlation matrices pinpoint extracellular matrix organization as a critical factor in pancreatic ductal adenocarcinoma [version 1; peer review: 1 approved, 1 approved with reservations]. F1000Research 2023, 12:418 (https://doi.org/10.5256/f1000research.144255.r353194)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
Views
2
Cite
Reviewer Report 16 Sep 2024
Stephan Joel Reshkin, University of Bari, Bari, Italy 
Approved
VIEWS 2
Concerning the manuscript entitled “Preserved correlation matrices pinpoint extracellular matrix organization as a critical factor in pancreatic ductal adenocarcinoma” by Menyhart, Bartha and Gyorffy. This is a very interesting study using state-of-the-art statistical methods with various data sets to find ... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
Reshkin SJ. Reviewer Report For: Preserved correlation matrices pinpoint extracellular matrix organization as a critical factor in pancreatic ductal adenocarcinoma [version 1; peer review: 1 approved, 1 approved with reservations]. F1000Research 2023, 12:418 (https://doi.org/10.5256/f1000research.144255.r281371)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.

Comments on this article Comments (0)

Version 1
VERSION 1 PUBLISHED 18 Apr 2023
Comment
Alongside their report, reviewers assign a status to the article:
Approved - the paper is scientifically sound in its current form and only minor, if any, improvements are suggested
Approved with reservations - A number of small changes, sometimes more significant revisions are required to address specific details and improve the papers academic merit.
Not approved - fundamental flaws in the paper seriously undermine the findings and conclusions
Sign In
If you've forgotten your password, please enter your email address below and we'll send you instructions on how to reset your password.

The email address should be the one you originally registered with F1000.

Email address not valid, please try again

You registered with F1000 via Google, so we cannot reset your password.

To sign in, please click here.

If you still need help with your Google account password, please click here.

You registered with F1000 via Facebook, so we cannot reset your password.

To sign in, please click here.

If you still need help with your Facebook account password, please click here.

Code not correct, please try again
Email us for further assistance.
Server error, please try again.