Research Article

Mapping of microRNAs related to cervical cancer in Latin American human genomic variants

[version 1; peer review: 2 approved with reservations]
PUBLISHED 20 Jun 2017

Abstract

Background: MicroRNAs are related to human cancers, including cervical cancer (CC), which is mainly caused by human papillomavirus (HPV) infection. In 2012, approximately 70000 cases and 28000 deaths from this cancer were registered in Latin America according to GLOBOCAN reports. The most frequent genotype worldwide is HPV-16. The main molecular mechanism of HPV in CC is related to integration of viral DNA into the hosts’ genome. However, the different variants in the human genome can result in different integration mechanisms, specifically involving microRNAs (miRNAs).
Methods: miRNA sequences associated with CC and four human genome variants from Latin American populations were obtained from miRBase and the 1000 Genomes Browser, respectively. HPV integration sites near cell cycle regulatory genes were identified. miRNAs were mapped on human genomic variants. miRSNPs (single nucleotide polymorphisms in miRNAs) were identified in the miRNA sequences located at HPV integration sites on the human genomic Latin American variants. 
Results: Two hundred seventy-two miRNAs associated with CC were identified in 139 reports from different geographic locations. By mapping with the Blast-Like Alignment Tool (BLAT), 2028 binding sites were identified from these miRNAs on the human genome (version GRCh38/hg38); 42 miRNAs were located on unique integration sites; and miR-5095, miR-548c-5p and miR-548d-5p were involved with multiple genes related to the cell cycle. Thirty-seven miRNAs were mapped on the human Latin American genomic variants (PUR, MXL, CLM and PEL), but only miR-11-3p, miR-31-3p, miR-107, miR-133a-3p, miR-133a-5p, miR-133b, miR-215-5p, miR-491-3p, miR-548d-5p and miR-944 were conserved.
Conclusions: 10 miRNAs were conserved in the four human genome variants, and in the remaining 27 miRNAs, substitutions, deletions or insertions were observed in the nucleotide sequences. This variability can imply differentiated mechanisms towards each genomic variant in human populations, relative to specific genomic patterns and geographic features. These findings may be decisive in determining susceptibility to the development of CC. Further identification of cellular genes and signalling pathways involved in CC progression could lead to the development of new therapeutic strategies based on miRNAs.

Keywords

cervical cancer, HPV, HPV integration sites, microRNAs, miRNAs, secondary structure, human genome variants, bioinformatics tools

Introduction

Cervical cancer (CC) is the second most common malignancy in women worldwide. According to GLOBOCAN reports, approximately 530,000 women are diagnosed with CC and 265,672 die from it each year1. Infection by human papillomavirus (HPV) has been recognized as the major risk factor in this pathology2,3, but the presence of the virus is not the main cause of the development of this cancer4,5. Viral DNA integration into the host cell genome is considered a conducive factor for cervical intraepithelial neoplasia (CIN) to develop into CC5–7.

Numerous microRNAs (miRNAs) have been identified in proximity to HPV integration sites8,9. miRNAs are a class of small (18 to 26 nucleotides in length), noncoding, evolutionarily conserved RNAs that are processed from longer transcripts known as pre-miRNAs (60 to 100 nucleotides in length)10. They are located in regions known as fragile sites and are distributed in intergenic, intronic and exonic segments of the human genome involved in cancer11,12. Functionally, they regulate post-transcriptional expression levels of up to 60% of total protein-encoding genes by binding through their seed sequences (nucleotides 2–8 of the miRNA). The seed sequence at the 5' end of the miRNA is complementary to sites in the 3'-UTR of the target mRNAs13. This recognition event can affect the expression of important regulatory genes. Deregulation of genes such as tumour suppressor genes and oncogenes can lead to cancer development, including CC14–16.
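
The seed-pairing step described above can be sketched in a few lines of Python. This is only an illustration and not a method used in this study: it assumes the seed is nucleotides 2–8 counted from the 5' end of the mature miRNA, it accepts perfect Watson-Crick pairing only, and both sequences are illustrative. The function names `reverse_complement` and `find_seed_sites` are hypothetical.

```python
# Minimal sketch of canonical seed matching.
# Assumptions: seed = nucleotides 2-8 of the mature miRNA, perfect Watson-Crick
# pairing only; the sequences below are illustrative, not data from the study.

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string written 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna.upper()))


def find_seed_sites(mirna: str, utr: str, start: int = 2, end: int = 8) -> list:
    """Return 0-based positions in `utr` that pair perfectly with the miRNA seed."""
    seed = mirna.upper()[start - 1:end]   # 1-based positions start..end
    site = reverse_complement(seed)       # sequence expected on the mRNA
    utr = utr.upper()
    return [i for i in range(len(utr) - len(site) + 1)
            if utr[i:i + len(site)] == site]


if __name__ == "__main__":
    mirna = "UAGCUUAUCAGACUGAUGUUGA"   # illustrative 22-nt mature miRNA
    utr = "GGGAUAAGCUCCC"              # hypothetical 3'-UTR fragment
    print(find_seed_sites(mirna, utr))
```

Real target prediction tools also weigh site context, conservation and pairing beyond the seed, so a perfect seed match is better read as a necessary first filter than as evidence of regulation.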

Human genome variants generate different patterns of miRNA deregulation17, which can contribute to cancer development susceptibility, treatment efficacy and patient prognosis18–20. 99% of the human genome is genetically identical, and the remaining 1% is responsible for all human diversity. miRNAs represent a major part of this genetic variation21. miRSNPs (single nucleotide polymorphisms in miRNAs) are human polymorphisms at or near predicted miRNA target sites22. The occurrence of miRSNPs can influence miRNA functionality on all levels, including transcription, maturation, and mRNA target binding.

Knowledge on miRNAs related to CC development in human genome variants from Latin American populations is scarce. Thus, in this study, we mapped miRNAs associated with CC in human genome variants obtained from Colombia, Mexico, Peru and Puerto Rico. Complete genomes were included in this study. Additionally, the relationships between HPV integration sites, genes close to these sites, mapping profiles and mutation patterns for each of the miRNAs were estimated for each of the genome sequences. The objective of this research was to analyse how genetic variation of CC-associated miRNAs identified in previously reported HPV integration sites affects cell cycle regulatory genes in human genomic variants from Latin America.

Methods

miRNA sequences associated with cervical cancer

Two hundred and seventy-two miRNAs associated with CC were selected as described in the systematic review published by Guerrero & Guerrero23. Features such as length and the chromosomal and genomic location of pre-miRNAs and mature miRNAs were analysed with the information contained in miRBase24–26, miRNAMap27 and miRNAstart. The mature miRNA reference sequences were obtained in FASTA format from the miRBase database (Dataset 128).
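
As an illustration of how such a FASTA file could be read into a lookup table, the sketch below parses mature-miRNA records into name, accession and sequence. It assumes headers of the form `>name accession description`; the function name `parse_mature_fasta` and the `MatureMirna` type are hypothetical, and the two sample records were typed in by hand for illustration and should be verified against miRBase before use.

```python
# Sketch: parse mature-miRNA FASTA records into a lookup table.
# Assumption: headers look like ">name accession free-text description".
# The sample records are hand-typed for illustration; verify against miRBase.
from typing import Dict, NamedTuple


class MatureMirna(NamedTuple):
    accession: str
    sequence: str


SAMPLE = """>hsa-let-7a-5p MIMAT0000062 Homo sapiens let-7a-5p
UGAGGUAGUAGGUUGUAUAGUU
>hsa-miR-21-5p MIMAT0000076 Homo sapiens miR-21-5p
UAGCUUAUCAGACUGAUGUUGA
"""


def parse_mature_fasta(text: str) -> Dict[str, MatureMirna]:
    """Map each miRNA name to its accession and (possibly multi-line) sequence."""
    records: Dict[str, MatureMirna] = {}
    name, accession, chunks = None, "", []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:              # flush the previous record
                records[name] = MatureMirna(accession, "".join(chunks))
            fields = line[1:].split()
            name = fields[0]
            accession = fields[1] if len(fields) > 1 else ""
            chunks = []
        else:
            chunks.append(line.upper())
    if name is not None:                      # flush the last record
        records[name] = MatureMirna(accession, "".join(chunks))
    return records


if __name__ == "__main__":
    for mirna, record in parse_mature_fasta(SAMPLE).items():
        print(mirna, record.accession, len(record.sequence))
```

Keying the table by miRNA name means that a mature sequence listed more than once (because it derives from several precursors) is stored only once.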

Latin American human genomic variants

Four human genome sequences were obtained from randomly selected female participants in the 1000 Genomes Project from Latin American populations22,29. Their codes were CLM (from Medellin in Colombia), MXL (from Los Angeles and of Mexican ancestry in the USA), PEL (from Lima in Peru) and PUR (from Puerto Rico). The control sequence was a variant that is phylogenetically distant to Latin American variants and identified with the code BEB (from Bangladesh and of Bengali ancestry). Access codes were obtained from the 1000 Genomes Project resources21,30. This information is summarized in Table 1.

Table 1. Access codes of the four Latin American human genome variants obtained from the NCBI 1000 genomes project.

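SEQUENCE TYPE | SEQUENCE NAME | DATABASE | ACCESS CODE
Genomic sequence | CLM | NCBI 1000 Genomes Project | HG01432
Genomic sequence | MXL | NCBI 1000 Genomes Project | NA19749
Genomic sequence | PEL | NCBI 1000 Genomes Project | HG01566
Genomic sequence | PUR | NCBI 1000 Genomes Project | HG00554
Genomic sequence | BEB (Control) | NCBI 1000 Genomes Project | HG03589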

Selection, identification and analysis of HPV integration sites near cell cycle regulatory genes

Viral insertion sites and nearby genes on the human genome were identified with the UCSC Genome Bioinformatics search engine31,32. To establish possible functional relationships with the development of CC, information on the functions of these human genes was obtained from UniProt33,34.

Mapping miRNAs and chromosomal locations on the human genome

According to Xia et al.35, the mature miRNA sequences are located in regions with pre-miRNA secondary structure complementarity (3' and 5'). In total, 445 miRNA sequences were analysed. The Blast-Like Alignment Tool (BLAT) available on the UCSC Genome Bioinformatics website was used for mapping the miRNAs associated with the full human genome with the following parameters: (a) genome, human; (b) assembly, Dec. 2013 (GRCh38/hg38); (c) query type, DNA; (d) sort output, query; and (e) score and output, hyperlinks. A matrix of chromosomal location data was built with Microsoft Excel 2013 (‘Matrix of data’ in Dataset 236). From this matrix, the miRNAs over HPV integration sites were manually identified.
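
The matrix-building step can be sketched programmatically. The study used the UCSC web form with hyperlink output and assembled the matrix in Excel; the sketch below instead assumes the hits were saved in BLAT's tab-separated PSL format, converts the RNA alphabet to DNA for the query, and tallies hits per miRNA and chromosome. All rows are made-up examples, the helper names (`rna_to_dna`, `parse_psl`, `hits_matrix`, `_psl_row`) are hypothetical, and the identity value is a simple ratio over aligned bases, not the formula used by the UCSC web page.

```python
# Sketch: tally BLAT hits per miRNA and chromosome from PSL-formatted rows.
# Assumptions: hits were saved as PSL (the study used hyperlink output instead);
# every row below is a made-up example, not a result from the paper.
from collections import defaultdict

# 0-based indexes of the PSL columns used here.
MATCHES, MISMATCHES, REP_MATCHES = 0, 1, 2
STRAND, Q_NAME, T_NAME, T_START, T_END = 8, 9, 13, 15, 16


def rna_to_dna(seq: str) -> str:
    """Convert a mature miRNA (RNA alphabet) into a DNA query string."""
    return seq.upper().replace("U", "T")


def parse_psl(lines):
    """Yield one dict per PSL alignment row, skipping header or malformed lines."""
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 21 or not cols[MATCHES].isdigit():
            continue
        matches = int(cols[MATCHES]) + int(cols[REP_MATCHES])
        aligned = matches + int(cols[MISMATCHES])
        yield {
            "mirna": cols[Q_NAME],
            "chrom": cols[T_NAME],
            "strand": cols[STRAND],
            "start": int(cols[T_START]),
            "end": int(cols[T_END]),
            # Simple identity over aligned bases (an approximation).
            "identity": 100.0 * matches / aligned if aligned else 0.0,
        }


def hits_matrix(hits):
    """Return {miRNA: {chromosome: number of hits}}."""
    matrix = defaultdict(lambda: defaultdict(int))
    for hit in hits:
        matrix[hit["mirna"]][hit["chrom"]] += 1
    return {mirna: dict(chroms) for mirna, chroms in matrix.items()}


def _psl_row(matches, mismatches, strand, q_name, q_size, chrom, t_start):
    """Build a made-up 21-column PSL row for the demo."""
    cols = [matches, mismatches, 0, 0, 0, 0, 0, 0, strand, q_name, q_size, 0,
            q_size, chrom, 1_000_000, t_start, t_start + q_size, 1,
            f"{q_size},", "0,", f"{t_start},"]
    return "\t".join(str(c) for c in cols)


SAMPLE_PSL = [
    "psLayout version 3",
    _psl_row(22, 0, "+", "mir-example-1", 22, "chr1", 1000),
    _psl_row(21, 1, "-", "mir-example-1", 22, "chr19", 5000),
    _psl_row(22, 0, "+", "mir-example-2", 22, "chr19", 9000),
]

if __name__ == "__main__":
    hits = list(parse_psl(SAMPLE_PSL))
    print(hits_matrix(hits))
```

Under this simple definition, a single mismatch in a 22-nucleotide query gives 21/22, or about 95.5% identity.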

Identification of miRNAs in Latin American human genomic variants

To identify miRNA mutations in the four Latin American human genome variants, the tools available in the NCBI 1000 Genomes Browser (Phase 3, version 3.7), including ideogram view, subjects and exon navigator, were used. The code for each selected female genomic variant (Colombia, Mexico, Peru, Puerto Rico and Bangladesh) was entered, the sequence of each miRNA identified in viral integration sites was introduced, and the mapped nucleotide positions were selected. Using WebLogo 337, logos were created to view the nucleotide differences. The bioinformatics workflow is summarized in Figure 1.
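
The comparison across variants can be illustrated with a short sketch. It is not the browser-based procedure used here: it assumes the sequences for one miRNA locus have already been aligned to the reference, with `-` marking gaps, and it uses invented sequences labelled with the population codes purely for readability. The function names are hypothetical.

```python
# Sketch: compare one miRNA locus across population samples.
# Assumptions: sequences are pre-aligned to the reference ('-' marks a gap);
# all sequences below are invented for illustration.
from collections import Counter


def classify_differences(reference: str, variant: str):
    """Return (position, kind, ref_base, var_base) per differing column (1-based)."""
    if len(reference) != len(variant):
        raise ValueError("aligned sequences must have equal length")
    diffs = []
    for pos, (ref, var) in enumerate(zip(reference.upper(), variant.upper()), start=1):
        if ref == var:
            continue
        if var == "-":
            kind = "deletion"
        elif ref == "-":
            kind = "insertion"
        else:
            kind = "substitution"
        diffs.append((pos, kind, ref, var))
    return diffs


def is_conserved(reference: str, variants: dict) -> bool:
    """True when every variant sequence is identical to the reference."""
    return all(not classify_differences(reference, seq) for seq in variants.values())


def column_counts(sequences):
    """Per-column base counts, the raw input that a sequence logo visualizes."""
    return [Counter(column) for column in zip(*sequences)]


REFERENCE = "TGAGGTAGTAGG"
VARIANTS = {
    "CLM": "TGAGGTAGTAGG",
    "MXL": "TGAGGTAGTAGG",
    "PEL": "TGAGCTAGTAGG",   # invented substitution at position 5
    "PUR": "TGAGGTAG-AGG",   # invented deletion at position 9
}

if __name__ == "__main__":
    for code, seq in VARIANTS.items():
        print(code, classify_differences(REFERENCE, seq))
    print("conserved:", is_conserved(REFERENCE, VARIANTS))
```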


Figure 1. Bioinformatic workflow for mapping of miRNAs related to CC on Latin American human genomic variants.

hsa-miR-1-3p MIMAT0000416;hsa-miR-1-5p MIMAT0031892;hsa-miR-1-3p MIMAT0000416;hsa-miR-7-5p MIMAT0000252;hsa-miR-7-1-3p MIMAT0004553;hsa-miR-7-5p MIMAT0000252;hsa-miR-7-2-3p MIMAT0004554;hsa-miR-9-5p MIMAT0000441;hsa-miR-9-3p MIMAT0000442;hsa-miR-9-5p MIMAT0000441
hsa-miR-9-3p MIMAT0000442;hsa-miR-10a-5p MIMAT0000253;hsa-miR-10a-3p MIMAT0004555;hsa-miR-10b-5p MIMAT0000254;hsa-miR-10b-3p MIMAT0004556;hsa-miR-15a-5p MIMAT0000068;hsa-miR-15a-3p MIMAT0004488;hsa-miR-15b-5p MIMAT0000417;hsa-miR-15b-3p MIMAT0004586;hsa-miR-16-5p MIMAT0000069
hsa-miR-16-1-3p MIMAT0004489;hsa-miR-16-5p MIMAT0000069;hsa-miR-16-2-3p MIMAT0004518;hsa-miR-17-5p MIMAT0000070;hsa-miR-17-3p MIMAT0000071;hsa-miR-18a-5p MIMAT0000072;hsa-miR-18a-3p MIMAT0002891;hsa-miR-18b-5p MIMAT0001412;hsa-miR-18b-3p MIMAT0004751;hsa-miR-19a-3p MIMAT0000073
hsa-miR-19a-3p MIMAT0000073;hsa-miR-19b-3p MIMAT0000074;hsa-miR-19b-1-5p MIMAT0004491;hsa-miR-19b-3p MIMAT0000074;hsa-miR-19b-2-5p MIMAT0004492;hsa-miR-20a-5p MIMAT0000075;hsa-miR-20a-3p MIMAT0004493;hsa-miR-20b-5p MIMAT0001413;hsa-miR-20b-3p MIMAT0004752;hsa-miR-21-5p MIMAT0000076
hsa-miR-21-3p MIMAT0004494;hsa-miR-23a-3p MIMAT0000078;hsa-miR-23a-5p MIMAT0004496;hsa-miR-23b-3p MIMAT0000418;hsa-miR-23b-5p MIMAT0004587;hsa-miR-25-3p MIMAT0000081;hsa-miR-25-5p MIMAT0004498;hsa-miR-26a-5p MIMAT0000082;hsa-miR-26a-1-3p MIMAT0004499;hsa-miR-26a-5p MIMAT0000082
hsa-miR-26a-2-3p MIMAT0004681;hsa-miR-26b-5p MIMAT0000083;hsa-miR-26b-3p MIMAT0004500;hsa-miR-27a-3p MIMAT0000084;hsa-miR-27a-5p MIMAT0004501;hsa-miR-27b-3p MIMAT0000419;hsa-miR-27b-5p MIMAT0004588;hsa-miR-28-5p MIMAT0000085;hsa-miR-28-3p MIMAT0004502;hsa-miR-29a-3p MIMAT0000086
hsa-miR-29a-5p MIMAT0004503;hsa-miR-29b-1-5p MIMAT0004514;hsa-miR-29b-3p MIMAT0000100;hsa-miR-29b-2-5p MIMAT0004515;hsa-miR-29c-3p MIMAT0000681;hsa-miR-29c-5p MIMAT0004673;hsa-miR-30a-5p MIMAT0000087;hsa-miR-30a-3p MIMAT0000088;hsa-miR-30b-5p MIMAT0000420;hsa-miR-30b-3p MIMAT0004589
hsa-miR-30d-5p MIMAT0000245;hsa-miR-30d-3p MIMAT0004551;hsa-miR-30e-5p MIMAT0000692;hsa-miR-30e-3p MIMAT0000693;hsa-miR-31-5p MIMAT0000089;hsa-miR-31-3p MIMAT0004504;hsa-miR-34a-5p MIMAT0000255;hsa-miR-34a-3p MIMAT0004557;hsa-miR-34b-3p MIMAT0004676;hsa-miR-34b-5p MIMAT0000685
hsa-miR-34c-5p MIMAT0000686;hsa-miR-34c-3p MIMAT0004677;hsa-miR-92a-3p MIMAT0000092;hsa-miR-92a-1-5p MIMAT0004507;hsa-miR-92a-3p MIMAT0000092;hsa-miR-92a-2-5p MIMAT0004508;hsa-miR-92b-3p MIMAT0003218;hsa-miR-92b-5p MIMAT0004792;hsa-miR-93-5p MIMAT0000093;hsa-miR-93-3p MIMAT0004509
hsa-miR-95-5p MIMAT0026473;hsa-miR-95-3p MIMAT0000094;hsa-miR-98-5p MIMAT0000096;hsa-miR-98-3p MIMAT0022842;hsa-miR-99a-5p MIMAT0000097;hsa-miR-99a-3p MIMAT0004511;hsa-miR-99b-5p MIMAT0000689;hsa-miR-99b-3p MIMAT0004678;hsa-miR-100-5p MIMAT0000098;hsa-miR-100-3p MIMAT0004512
hsa-miR-101-3p MIMAT0000099;hsa-miR-101-5p MIMAT0004513;hsa-miR-101-3p MIMAT0000099;hsa-miR-103a-3p MIMAT0000101;hsa-miR-103a-3p MIMAT0000101;hsa-miR-103a-2-5p MIMAT0009196;hsa-miR-106a-5p MIMAT0000103;hsa-miR-106a-3p MIMAT0004517;hsa-miR-106b-5p MIMAT0000680;hsa-miR-106b-3p MIMAT0004672
hsa-miR-107 MIMAT0000104;hsa-miR-122-5p MIMAT0000421;hsa-miR-122-3p MIMAT0004590;hsa-miR-124-3p MIMAT0000422;hsa-miR-124-5p MIMAT0004591;hsa-miR-124-3p MIMAT0000422;hsa-miR-124-5p MIMAT0004591;hsa-miR-124-3p MIMAT0000422;hsa-miR-124-5p MIMAT0004591;hsa-miR-125a-5p MIMAT0000443
hsa-miR-125a-3p MIMAT0004602;hsa-miR-125b-5p MIMAT0000423;hsa-miR-125b-1-3p MIMAT0004592;hsa-miR-125b-5p MIMAT0000423;hsa-miR-125b-1-3p MIMAT0004592;hsa-miR-126-3p MIMAT0000445;hsa-miR-126-5p MIMAT0000444;hsa-miR-127-3p MIMAT0000446;hsa-miR-127-5p MIMAT0004604;hsa-miR-129-5p MIMAT0000242
hsa-miR-129-1-3p MIMAT0004548;hsa-miR-129-5p MIMAT0000242;hsa-miR-129-2-3p MIMAT0004605;hsa-miR-130a-3p MIMAT0000425;hsa-miR-130a-5p MIMAT0004593;hsa-miR-130b-3p MIMAT0000691;hsa-miR-130b-5p MIMAT0004680;hsa-miR-132-3p MIMAT0000426;hsa-miR-132-5p MIMAT0004594;hsa-miR-133a-3p MIMAT0000427
hsa-miR-133a-5p MIMAT0026478;hsa-miR-133a-3p MIMAT0000427;hsa-miR-133a-5p MIMAT0026478;hsa-miR-133b MIMAT0000770;hsa-miR-134-5p MIMAT0000447;hsa-miR-134-3p MIMAT0026481;hsa-miR-135a-5p MIMAT0000428;hsa-miR-135a-3p MIMAT0004595;hsa-miR-135a-5p MIMAT0000428;hsa-miR-135b-5p MIMAT0000758
hsa-miR-135b-3p MIMAT0004698;hsa-miR-136-5p MIMAT0000448;hsa-miR-136-3p MIMAT0004606;hsa-miR-137 MIMAT0000429;hsa-miR-138-5p MIMAT0000430;hsa-miR-138-1-3p MIMAT0004607;hsa-miR-138-5p MIMAT0000430;hsa-miR-138-2-3p MIMAT0004596;hsa-miR-139-5p MIMAT0000250;hsa-miR-139-3p MIMAT0004552
hsa-miR-140-5p MIMAT0000431;hsa-miR-140-3p MIMAT0004597;hsa-miR-141-3p MIMAT0000432;hsa-miR-141-5p MIMAT0004598;hsa-miR-142-5p MIMAT0000433;hsa-miR-142-3p MIMAT0000434;hsa-miR-143-3p MIMAT0000435;hsa-miR-143-5p MIMAT0004599;hsa-miR-145-5p MIMAT0000437;hsa-miR-145-3p MIMAT0004601
hsa-miR-146a-5p MIMAT0000449;hsa-miR-146a-3p MIMAT0004608;hsa-miR-146b-5p MIMAT0002809;hsa-miR-146b-3p MIMAT0004766;hsa-miR-148a-3p MIMAT0000243;hsa-miR-148a-5p MIMAT0004549;hsa-miR-148b-3p MIMAT0000759;hsa-miR-148b-5p MIMAT0004699;hsa-miR-149-5p MIMAT0000450;hsa-miR-149-3p MIMAT0004609
hsa-miR-150-5p MIMAT0000451;hsa-miR-150-3p MIMAT0004610;hsa-miR-151a-5p MIMAT0004697;hsa-miR-151a-3p MIMAT0000757;hsa-miR-152-5p MIMAT0026479;hsa-miR-152-3p MIMAT0000438;hsa-miR-155-5p MIMAT0000646;hsa-miR-155-3p MIMAT0004658;hsa-miR-181a-5p MIMAT0000256;hsa-miR-181a-3p MIMAT0000270
hsa-miR-181a-5p MIMAT0000256;hsa-miR-181a-2-3p MIMAT0004558;hsa-miR-181b-5p MIMAT0000257;hsa-miR-181b-3p MIMAT0022692;hsa-miR-181b-5p MIMAT0000257;hsa-miR-181b-3p MIMAT0022692;hsa-miR-181c-5p MIMAT0000258;hsa-miR-181c-3p MIMAT0004559;hsa-miR-182-5p MIMAT0000259;hsa-miR-182-3p MIMAT0000260
hsa-miR-183-5p MIMAT0000261;hsa-miR-183-3p MIMAT0004560;hsa-miR-185-5p MIMAT0000455;hsa-miR-185-3p MIMAT0004611;hsa-miR-186-5p MIMAT0000456;hsa-miR-186-3p MIMAT0004612;hsa-miR-187-3p MIMAT0000262;hsa-miR-187-5p MIMAT0004561;hsa-miR-191-5p MIMAT0000440;hsa-miR-191-3p MIMAT0001618
hsa-miR-192-5p MIMAT0000222;hsa-miR-192-3p MIMAT0004543;hsa-miR-193b-3p MIMAT0002819;hsa-miR-193b-5p MIMAT0004767;hsa-miR-194-5p MIMAT0000460;hsa-miR-194-5p MIMAT0000460;hsa-miR-194-3p MIMAT0004671;hsa-miR-195-5p MIMAT0000461;hsa-miR-195-3p MIMAT0004615;hsa-miR-196a-5p MIMAT0000226
hsa-miR-196b-5p MIMAT0001080;hsa-miR-196b-3p MIMAT0009201;hsa-miR-199a-5p MIMAT0000231;hsa-miR-199a-3p MIMAT0000232;hsa-miR-199b-5p MIMAT0000263;hsa-miR-199b-3p MIMAT0004563;hsa-miR-200a-3p MIMAT0000682;hsa-miR-200a-5p MIMAT0001620;hsa-miR-200b-3p MIMAT0000318;hsa-miR-200b-5p MIMAT0004571
hsa-miR-200c-3p MIMAT0000617;hsa-miR-200c-5p MIMAT0004657;hsa-miR-203a-3p MIMAT0000264;hsa-miR-203a-5p MIMAT0031890;hsa-miR-204-5p MIMAT0000265;hsa-miR-204-3p MIMAT0022693;hsa-miR-205-5p MIMAT0000266;hsa-miR-205-3p MIMAT0009197;hsa-miR-210-5p MIMAT0026475;hsa-miR-210-3p MIMAT0000267
hsa-miR-211-5p MIMAT0000268;hsa-miR-211-3p MIMAT0022694;hsa-miR-212-3p MIMAT0000269;hsa-miR-212-5p MIMAT0022695;hsa-miR-214-3p MIMAT0000271;hsa-miR-214-5p MIMAT0004564;hsa-miR-215-5p MIMAT0000272;hsa-miR-215-3p MIMAT0026476;hsa-miR-218-5p MIMAT0000275;hsa-miR-218-1-3p MIMAT0004565
hsa-miR-221-3p MIMAT0000278;hsa-miR-221-5p MIMAT0004568;hsa-miR-223-3p MIMAT0000280;hsa-miR-223-5p MIMAT0004570;hsa-miR-224-5p MIMAT0000281;hsa-miR-224-3p MIMAT0009198;hsa-miR-299-3p MIMAT0000687;hsa-miR-299-5p MIMAT0002890;hsa-miR-301a-3p MIMAT0000688;hsa-miR-301a-5p MIMAT0022696
hsa-miR-301b-3p MIMAT0004958;hsa-miR-301b-5p MIMAT0032026;hsa-miR-302a-3p MIMAT0000684;hsa-miR-302a-5p MIMAT0000683;hsa-miR-302b-3p MIMAT0000715;hsa-miR-302b-5p MIMAT0000714;hsa-miR-302c-3p MIMAT0000717;hsa-miR-302c-5p MIMAT0000716;hsa-miR-302d-3p MIMAT0000718;hsa-miR-302d-5p MIMAT0004685
hsa-miR-320a MIMAT0000510;hsa-miR-323a-3p MIMAT0000755;hsa-miR-323a-5p MIMAT0004696;hsa-miR-324-5p MIMAT0000761;hsa-miR-324-3p MIMAT0000762;hsa-miR-328-5p MIMAT0026486;hsa-miR-328-3p MIMAT0000752;hsa-miR-329-5p MIMAT0026555;hsa-miR-329-3p MIMAT0001629;hsa-miR-330-3p MIMAT0000751
hsa-miR-330-5p MIMAT0004693;hsa-miR-335-5p MIMAT0000765;hsa-miR-335-3p MIMAT0004703;hsa-miR-337-3p MIMAT0000754;hsa-miR-337-5p MIMAT0004695;hsa-miR-338-3p MIMAT0000763;hsa-miR-338-5p MIMAT0004701;hsa-miR-339-5p MIMAT0000764;hsa-miR-339-3p MIMAT0004702;hsa-miR-342-3p MIMAT0000753
hsa-miR-342-5p MIMAT0004694;hsa-miR-345-5p MIMAT0000772;hsa-miR-345-3p MIMAT0022698;hsa-miR-346 MIMAT0000773;hsa-miR-361-5p MIMAT0000703;hsa-miR-361-3p MIMAT0004682;hsa-miR-363-3p MIMAT0000707;hsa-miR-363-5p MIMAT0003385;hsa-miR-365a-3p MIMAT0000710;hsa-miR-365a-5p MIMAT0009199
hsa-miR-367-3p MIMAT0000719;hsa-miR-367-5p MIMAT0004686;hsa-miR-371a-3p MIMAT0000723;hsa-miR-371a-5p MIMAT0004687;hsa-miR-372-5p MIMAT0026484;hsa-miR-372-3p MIMAT0000724;hsa-miR-373-3p MIMAT0000726;hsa-miR-373-5p MIMAT0000725;hsa-miR-374a-5p MIMAT0000727;hsa-miR-374a-3p MIMAT0004688
hsa-miR-375 MIMAT0000728;hsa-miR-376a-3p MIMAT0000729;hsa-miR-376a-5p MIMAT0003386;hsa-miR-376c-3p MIMAT0000720;hsa-miR-376c-5p MIMAT0022861;hsa-miR-378a-3p MIMAT0000732;hsa-miR-378a-5p MIMAT0000731;hsa-miR-379-5p MIMAT0000733;hsa-miR-379-3p MIMAT0004690;hsa-miR-411-5p MIMAT0003329
hsa-miR-411-3p MIMAT0004813;hsa-miR-422a MIMAT0001339;hsa-miR-424-5p MIMAT0001341;hsa-miR-424-3p MIMAT0004749;hsa-miR-425-5p MIMAT0003393;hsa-miR-425-3p MIMAT0001343;hsa-miR-429 MIMAT0001536;hsa-miR-432-5p MIMAT0002814;hsa-miR-432-3p MIMAT0002815;hsa-miR-433-5p MIMAT0026554
hsa-miR-433-3p MIMAT0001627;hsa-miR-449a MIMAT0001541;hsa-miR-449b-5p MIMAT0003327;hsa-miR-449b-3p MIMAT0009203;hsa-miR-450a-5p MIMAT0001545;hsa-miR-450a-1-3p MIMAT0022700;hsa-miR-451a MIMAT0001631;hsa-miR-455-5p MIMAT0003150;hsa-miR-455-3p MIMAT0004784;hsa-miR-483-3p MIMAT0002173
hsa-miR-483-5p MIMAT0004761;hsa-miR-485-5p MIMAT0002175;hsa-miR-485-3p MIMAT0002176;hsa-miR-486-5p MIMAT0002177;hsa-miR-486-3p MIMAT0004762;hsa-miR-487a-3p MIMAT0002178;hsa-miR-487a-5p MIMAT0026559;hsa-miR-487b-5p MIMAT0026614;hsa-miR-487b-3p MIMAT0003180;hsa-miR-491-5p MIMAT0002807
hsa-miR-491-3p MIMAT0004765;hsa-miR-494-5p MIMAT0026607;hsa-miR-494-3p MIMAT0002816;hsa-miR-495-5p MIMAT0022924;hsa-miR-495-3p MIMAT0002817;hsa-miR-497-5p MIMAT0002820;hsa-miR-497-3p MIMAT0004768;hsa-miR-500a-5p MIMAT0004773;hsa-miR-500a-3p MIMAT0002871;hsa-miR-501-5p MIMAT0002872
hsa-miR-501-3p MIMAT0004774;hsa-miR-507 MIMAT0002879;hsa-miR-512-5p MIMAT0002822;hsa-miR-512-3p MIMAT0002823;hsa-miR-512-3p MIMAT0002823;hsa-miR-513a-5p MIMAT0002877;hsa-miR-513a-3p MIMAT0004777;hsa-miR-513c-5p MIMAT0005789;hsa-miR-513c-3p MIMAT0022728;hsa-miR-517a-3p MIMAT0002852
hsa-miR-517-5p MIMAT0002851;hsa-miR-517c-3p MIMAT0002866;hsa-miR-517-5p MIMAT0002851;hsa-miR-518a-3p MIMAT0002863;hsa-miR-518a-5p MIMAT0005457;hsa-miR-518a-3p MIMAT0002863;hsa-miR-518a-5p MIMAT0005457;hsa-miR-518b MIMAT0002844;hsa-miR-518f-3p MIMAT0002842;hsa-miR-518f-5p MIMAT0002841
hsa-miR-522-3p MIMAT0002868;hsa-miR-522-5p MIMAT0005451;hsa-miR-523-3p MIMAT0002840;hsa-miR-523-5p MIMAT0005449;hsa-miR-525-5p MIMAT0002838;hsa-miR-525-3p MIMAT0002839;hsa-miR-539-5p MIMAT0003163;hsa-miR-539-3p MIMAT0022705;hsa-miR-542-5p MIMAT0003340;hsa-miR-542-3p MIMAT0003389
hsa-miR-545-3p MIMAT0003165;hsa-miR-545-5p MIMAT0004785;hsa-miR-548b-3p MIMAT0003254;hsa-miR-548b-5p MIMAT0004798;hsa-miR-548c-3p MIMAT0003285;hsa-miR-548c-5p MIMAT0004806;hsa-miR-548d-3p MIMAT0003323;hsa-miR-548d-5p MIMAT0004812;hsa-miR-557 MIMAT0003221;hsa-miR-558 MIMAT0003222
hsa-miR-572 MIMAT0003237;hsa-miR-574-3p MIMAT0003239;hsa-miR-574-5p MIMAT0004795;hsa-miR-575 MIMAT0003240;hsa-miR-576-5p MIMAT0003241;hsa-miR-576-3p MIMAT0004796;hsa-miR-581 MIMAT0003246;hsa-miR-582-5p MIMAT0003247;hsa-miR-582-3p MIMAT0004797;hsa-miR-584-5p MIMAT0003249
hsa-miR-584-3p MIMAT0022708;hsa-miR-588 MIMAT0003255;hsa-miR-590-5p MIMAT0003258;hsa-miR-590-3p MIMAT0004801;hsa-miR-603 MIMAT0003271;hsa-miR-606 MIMAT0003274;hsa-miR-609 MIMAT0003277;hsa-miR-610 MIMAT0003278;hsa-miR-617 MIMAT0003286;hsa-miR-619-5p MIMAT0026622
hsa-miR-619-3p MIMAT0003288;hsa-miR-622 MIMAT0003291;hsa-miR-625-5p MIMAT0003294;hsa-miR-625-3p MIMAT0004808;hsa-miR-629-5p MIMAT0004810;hsa-miR-629-3p MIMAT0003298;hsa-miR-630 MIMAT0003299;hsa-miR-638 MIMAT0003308;hsa-miR-641 MIMAT0003311;hsa-miR-642a-5p MIMAT0003312
hsa-miR-642a-3p MIMAT0020924;hsa-miR-654-5p MIMAT0003330;hsa-miR-654-3p MIMAT0004814;hsa-miR-661 MIMAT0003324;hsa-miR-663a MIMAT0003326;hsa-miR-744-5p MIMAT0004945;hsa-miR-744-3p MIMAT0004946;hsa-miR-765 MIMAT0003945;hsa-miR-769-5p MIMAT0003886;hsa-miR-769-3p MIMAT0003887
hsa-miR-802 MIMAT0004185;hsa-miR-875-5p MIMAT0004922;hsa-miR-875-3p MIMAT0004923;hsa-miR-888-5p MIMAT0004916;hsa-miR-888-3p MIMAT0004917;hsa-miR-920 MIMAT0004970;hsa-miR-922 MIMAT0004972;hsa-miR-940 MIMAT0004983;hsa-miR-941 MIMAT0004984;hsa-miR-944 MIMAT0004987
>hsa-miR-1244 MIMAT0005896;hsa-miR-1246 MIMAT0005898;hsa-miR-1255a MIMAT0005906;hsa-miR-1262 MIMAT0005914;hsa-miR-1271-5p MIMAT0005796;hsa-miR-1271-3p MIMAT0022712;hsa-miR-1273g-5p MIMAT0020602;hsa-miR-1273g-3p MIMAT0022742;hsa-miR-1273f MIMAT0020601;hsa-miR-1286 MIMAT0005877
hsa-miR-1287-5p MIMAT0005878;hsa-miR-1287-3p MIMAT0026738;hsa-miR-1290 MIMAT0005880;hsa-miR-3138 MIMAT0015006;hsa-miR-3144-5p MIMAT0015014;hsa-miR-3144-3p MIMAT0015015;hsa-miR-3663-5p MIMAT0018084;hsa-miR-3663-3p MIMAT0018085;hsa-miR-3926 MIMAT0018201;hsa-miR-4271 MIMAT0016901
hsa-miR-4327 MIMAT0016889;hsa-miR-5095 MIMAT0020600;hsa-miR-5096 MIMAT0020603;hsa-let-7a-5p MIMAT0000062;hsa-let-7a-3p MIMAT0004481;hsa-let-7b-5p MIMAT0000063;hsa-let-7b-3p MIMAT0004482;hsa-let-7c-5p MIMAT0000064;hsa-let-7c-3p MIMAT0026472;hsa-let-7d-5p MIMAT0000065
Dataset 1. The mature miRNA reference sequences were obtained in FASTA format from the miRBase database.
Dataset 2. Matrix of data containing all the necessary components for the validation of data on CC-associated miRNAs in HPV integration sites in Latin American human genomic variants.

Results

HPV integration sites and chromosomal distribution

A total of 44 publications were identified between 1987 and 2015 related to HPV integration sites in the human genome. The most frequent types of HPV associated with CC were HPV-16 and HPV-18. Details of these articles are outlined in Supplementary File 1. Five hundred and seventy-eight integration sites for 8 types of HPV associated with different histological cervical conditions were identified, of which 63.84% were HPV-16 (Figure 2 and ‘HPV integration sites’ in Dataset 236).


Figure 2. Chromosomal distribution of integration sites of HPV types (HPV 16, 18, 31, 33, 45, 58, 67 and 68) most frequently reported in the literature.

HPV-16 and HPV-18 have integration sites on all human chromosomes. HPV-16 has more integration sites on chromosomes 2, 1, 3, 6, 9, 5, 8 and 4, while HPV-18 has more on chromosomes 2, 1, 8, 12, 5, 10, 4, 6 and 9. Some less frequently oncogenic HPV types have integration sites on specific chromosomes, such as HPV-45 on 2, 1, 3, 9, 4, 7 and 13; HPV-33 on 9, 13, 5, 6, 8, 11, 16, 18 and X; HPV-58 on 4, 12 and 18; HPV-31 on 2 and 17; HPV-67 on 4 and 13; and HPV-68 on chromosome 18. Chromosomes 1 and 2 displayed a higher number of viral insertion sites (41 and 45, respectively), while chromosomes 13 and 18 displayed insertion sites for 5 different HPV genotypes. The chromosomal loci with the highest numbers of HPV integration sites are presented in Table 2.

Table 2. Chromosomal loci with the highest numbers of HPV integration sites.

CHROMOSOMAL LOCUS | HPV INTEGRATION SITES | HPV TYPES
8q24.21 | 23 | 16, 18, 45
3q28 and 13q22.1 | 9 | 16, 18, 45
4q13.3 | 7 | 16, 45
2q34 | 6 | 16, 18
2q22.3 and 20p12.1 | 5 | 16, 18
13q21 and 17q12 | 5 | 16

Analysis of HPV integration sites near cell cycle regulatory genes

Information on the associated functions of genes located near HPV integration sites obtained from UniProt showed that 86.1% of the genes located in close proximity were involved in apoptosis, cell adhesion, cell differentiation, ion transport and metabolic processes. Fifty-four genes were involved in direct regulation of the cell cycle. Twenty-six of these were tumour suppressor genes, 8 were oncogenes, 8 were proto-oncogenes and 12 did not have a determined functionality in the development of this neoplasia (Figure 3).


Figure 3. Functional classification of cellular genes in HPV integration sites.

Mapping miRNAs associated with cervical cancer

A total of 2028 binding sites for CC-associated miRNAs were identified in the human genome by BLAT mapping of the previously identified miRNAs23, including 432 sites previously reported in miRBase (‘Results of mapping with BLAT’ in Dataset 236). These sites were located on both DNA strands (52.97% on the positive strand and 47.03% on the negative strand). Of these, 1881 binding sites were fully complementary (100% sequence identity) to miRNA sequences, while 1, 24 and 122 binding sites had 96.2%, 95.7% and 95.5% sequence identity, respectively.

miR-5095 was mapped onto 853 binding sites on 23 chromosomes. Four hundred and twenty-four mature miRNA sequences (98.15%) mapped to between one and ten different binding sites. miR-522-5p and miR-523-5p binding sites mapped to only a single chromosome (Chr. 19). Table 3 shows the chromosomal location and number of binding sites for each specific miRNA associated with CC.

Table 3. Chromosomal location and frequency of miRNA binding sites associated with CC.

miRNA ASSOCIATED WITH CC | miRNA BINDING SITES | CHROMOSOMAL LOCATION
hsa-miR-5095 | 853 | All chromosomes
hsa-miR-548c-5p | 194 | All, except 9
hsa-miR-548d-5p | 188 | All, except X, Y
hsa-miR-548b-5p | 87 | All, except 3, 4, 5, 6, X, Y
hsa-miR-574-5p | 62 | All, except 16, 21, Y
hsa-miR-576-3p | 15 | 4, 5, 8, 9, 12, 13, 15, 18, 22, X
hsa-miR-548c-3p | 13 | 2, 4, 5, 7, 8, 13, 14, X, Y
hsa-miR-1273g-5p | 11 | 1, 3, 7, 9, 10, 11, 13, 14, 15
hsa-miR-95-5p | 10 | 1, 2, 4, 6, 7, 13, X
hsa-miR-1244 | 9 | 2, 3, 5, 7, 12, 13, 14, 20
hsa-miR-545-3p | 8 | 3, 5, 7, 10, 12, X
hsa-miR-378a-3p | 7 | 3, 5, 10, 11, 14, 17, 18
hsa-miR-522-5p, -523-5p | 7 | 19
hsa-miR-518f-5p | 6 | 5, 19
hsa-miR-545-5p | 6 | 2, 3, 5, 14, 17, X
hsa-miR-151a-5p | 5 | 1, 4, 8, 19, X
hsa-miR-339-5p | 5 | 5, 7, 20, 22
hsa-miR-603 | 4 | 10, 13, 14, 16
hsa-miR-7-5p | 4 | 9, 10, 15, 19
hsa-miR-584-5p | 4 | 4, 5, 9, 19

The distribution of the 2028 binding sites was not homogeneous along the human genome. 41% of the total binding sites were identified on chromosomes 1, 19, 5, 2, 3, 14, 7 and X. Although the number of miRNA binding sites correlated with the size of each chromosome, some short chromosomes, such as 19 and X, had more miRNA binding sites when compared to other larger chromosomes (Table 4).

Table 4. Distribution of binding sites in chromosomes identified in miRNAs associated with CC.

CHR= Chromosome.

CHR. | NUMBER OF miRNA BINDING SITES | (%)
1 | 175 | 8.63
2 | 108 | 5.33
3 | 106 | 5.23
4 | 89 | 4.39
5 | 111 | 5.47
6 | 87 | 4.29
7 | 103 | 5.08
8 | 81 | 3.99
9 | 79 | 3.90
10 | 92 | 4.54
11 | 93 | 4.59
12 | 93 | 4.59
13 | 71 | 3.50
14 | 106 | 5.23
15 | 66 | 3.25
16 | 81 | 3.99
17 | 94 | 4.64
18 | 57 | 2.81
19 | 131 | 6.46
20 | 42 | 2.07
21 | 27 | 1.33
22 | 29 | 1.43
X | 100 | 4.93
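
The percentage column of Table 4 can be recomputed from the site counts with a short sketch. It assumes that each percentage is taken over the 2028 binding sites reported in the text; the names `percentages` and `ranked` are hypothetical.

```python
# Sketch: recompute the percentage column of Table 4 from the site counts.
# Assumption: percentages are taken over the 2028 binding sites reported in the text.
TOTAL_SITES = 2028

SITES_PER_CHROMOSOME = {
    "1": 175, "2": 108, "3": 106, "4": 89, "5": 111, "6": 87, "7": 103, "8": 81,
    "9": 79, "10": 92, "11": 93, "12": 93, "13": 71, "14": 106, "15": 66, "16": 81,
    "17": 94, "18": 57, "19": 131, "20": 42, "21": 27, "22": 29, "X": 100,
}


def percentages(counts, total=TOTAL_SITES):
    """Share of all binding sites found on each chromosome, in percent."""
    return {chrom: round(100 * n / total, 2) for chrom, n in counts.items()}


def ranked(counts):
    """Chromosomes ordered by descending site count (ties keep table order)."""
    return sorted(counts, key=lambda chrom: -counts[chrom])


if __name__ == "__main__":
    shares = percentages(SITES_PER_CHROMOSOME)
    for chrom in ranked(SITES_PER_CHROMOSOME)[:8]:
        print(chrom, SITES_PER_CHROMOSOME[chrom], shares[chrom])
```

Ranking by count rather than by chromosome number makes it easy to see that chromosomes 19 and X sit among the eight chromosomes with the most binding sites.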

14.89% (302) of binding sites grouped into the following 19 specific chromosomal locations: (1) 19q13.42 (51 sites/14 miRNAs), (2) 14q32.31 (34 sites/16 miRNAs), (3) 13q31.3 (16 sites/11 miRNAs), (4) 14q32.2 (16 sites/9 miRNAs), (5) 4q25 (16 sites/7 miRNAs), (6) 20q13.33 (15 sites/7 miRNAs), (7) 16p13.3 (15 sites/4 miRNAs), (8) Xq26.2 (14 sites/8 miRNAs), (9) 7q22.1 (14 sites/6 miRNAs) and (10) 1p31.3 (14 sites/6 miRNAs). The remaining 9 chromosomal locations contained between 10 and 13 binding sites (Supplementary File 2). 92% (1865/2028) of the binding sites were distributed into 250 groups along the human genome; the remaining 8% (163/2028) of binding sites for various miRNAs including miR-5095 were distributed along the human genome without being distributed into any groups.

Each group contains between 2 and 7 miRNA binding sites, although some groups contain between 8 and 16 (Figure 4). The majority of the groups are located on chromosomes 1, 2, 3, 5, 10 and 11. The biggest groups are located on chromosome 19, with 51 binding sites for 25 miRNAs involved in CC development.


Figure 4. Chromosomal distribution of groups of identified miRNA binding sites.

58.8% of miRNA binding sites associated with CC (1194 binding sites) are located in intergenic regions, 39.65% (804 binding sites) in intronic regions, 1.28% (26 binding sites) in exonic regions and 0.19% (4 binding sites) between intronic and exonic regions (mixed miRNAs). Figure 5 shows the variation in the number of intergenic, exonic and intronic miRNAs associated with CC.


Figure 5. Variation in the number of intergenic, exonic and intronic miRNAs associated with cervical cancer.

miRNA identification in selected HPV integration sites

Thirty-eight integration sites were found for six types of oncogenic HPV (HPV-16, -18, -33, -45, -58 and -68) in miRNA binding sites and cell cycle regulatory genes associated with CC (Table 5). The largest number of HPV integration sites was found for miR-5095 (33 sites), followed by miR-548c-5p (11 sites) and miR-548d-5p (11 sites) (Table 5). In 14 integration sites, no miRNA binding sites were detected. The highest number of miRNA binding sites was found in chromosome regions 18q11.2 and 19p13.12 (Supplementary File 2).

Table 5. miRNAs in HPV integration sites and their correlation with cell cycle regulatory genes.

HPV TYPES | HPV INTEGRATION SITES | miRNAs PRESENT AT SITES OF INTEGRATION OF HPV (1) | CELLULAR GENES (2) | CL. (3)
18 | 1p22.2 | miR-548c-5p (-) | CDC7 (+) | --
18 | 1p31.2 | - | GADD45A (+) | ST
16 | 1p34.1 | - | PLK3 (+) | --
16 | 1p34.3 | miR-5095 (3; -,-,+), -548b-5p (-), -548c-5p (2; -,-), -548d-5p (-) | CDCA8 (+) | OG
16 | 1q25 | - | TPR (-) | --
16 | 1q36.32 | - | TP73 (+) | ST
16, 18 | 1q41 | miR-5095 (2; +,+), -194-5p (-), -215-3p (-), -215-5p (-), -548b-5p (-) | PROX1 (+) | ST
18 | 2p15 | miR-5095 (-) | XPO1 (-) | ST
16 | 2q33.1 | miR-152-5p (-), -548d-5p (-) | ORC2 (-); BZW1 (+) | --; --
16 | 2q33.3 | miR-5095 (+) | PARD3B (+) | ST
16 | 2q34 | miR-5095 (-) | BARD1 (-) | ST
16 | 3p21.31 | miR-5095 (3; -,+,+), -191-3p (-), -191-5p (-), -425-3p (-), -425-5p (-) | MAP4 (-) | --
16 | 3q26.33 | miR-5095 (2; -,+) | SOX2 (+) | OG
16 | 3q28 | miR-5095 (-), -944 (+), -28-3p (+), -28-5p (+) | P3H2 (-); TP63 (+) | ST; ST
16, 45 | 4q13.3 | - | CXCL8 (+) | PO
16 | 4q23 | - | EIF4E (-) | OG
16 | 4q31.21 | miR-548c-5p (+) | FBXW7 (-) | ST
16 | 5q11.2 | miR-5095 (3; -,-,+), -449a (-), -449b-3p (-), -449b-5p (-), -548c-3p (+), -548d-5p (+), -581 (-) | MAP3K1 (+) | ST
16 | 5q31.1 | miR-5095 (-) | PPP2CA (-) | ST
16 | 6p21.31 | miR-5095 (+) | BAK1 (-) | ST
16 | 6p22.3 | miR-5095 (4; -,-,+,+), -548c-5p (+), -548d-5p (2; +,+) | ID4 (+) | ST
16 | 6q22.32 | - | CENPW (+) | --
16 | 6q23.3 | miR-5095 (3; -,+,+) | CITED2 (-) | ST
16 | 7p21.1 | - | AHR (+) | ST
18 | 7q36.2 | miR-5095 (-) | RHEB (-) | PO
18 | 8q21.2 | - | E2F5 (+) | --
16, 18 | 8q21.3 | - | NBN (-) | ST
16, 18, 45 | 8q24.21 | miR-5095 (-), -548d-5p (-) | MYC (+) | PO
16 | 8q24.21 | miR-5095 (-), -548d-5p (-) | PVT1 (+) | OG
18 | 9p21.3 | miR-5095 (+), -31-3p (-), -31-5p (-), -491-3p (+), -491-5p (+) | CDKN2A (-) | ST
16 | 9q22.2 | miR-5095 (+), -576-3p (2; +,+) | CKS2 (+) | OG
16, 18 | 10q23.31 | miR-5095 (-), -107 (-), -103a-3p (-), -548b-5p (2; -,-), -548d-5p (2; -,-) | PTEN (+) | ST
16 | 10q24.2 | miR-5095 (-), -1287-5p (-) | MARVELD1 (+) | ST
16 | 12q14.3 | miR-574-5p (-) | CDK4 (-); MDM2 (+) | OG; OG
18 | 12q15 | - | HMGA2 (+) | PO
58 | 12q24.33 | - | ZNF268 (+) | ST
18 | 14q11.2 | miR-5095 (+), -548c-3p (+), -574-5p (+) | HAUS4 (-) | --
18, 45 | 14q24.1 | miR-5095 (2; -,+), -548c-5p (+) | RAD51B (+) | ST
18 | 15q21.3 | miR-5095 (2; -,+), -574-5p (-) | CCNB2 (+) | PO
16 | 16p13.3 | miR-5095 (12; 7 -, 5 +), -548c-5p (+), -572 (-), -940 (+) | TSC2 (+) | ST
16 | 17q21.31 | miR-5095 (3; -,+,+) | BRCA1 (-) | ST
33 | 18q11.2 | miR-5095 (-), -1-3p (-), -133a-3p, -133a-5p (-), -133b, -378a-3p (+), -548b-5p (-), -548d-5p (-) | TTC39C (+) | --
68 | 18q21.1 | miR-5095 (3; -,+,+), -548c-5p (+), -548d-5p (+), -574-5p (+) | ZBTB7C (-) | ST
18 | 18q21.33 | miR-5095 (-), -548b-5p (+), -548c-5p (-), -548d-5p (+) | BCL2 (-) | PO
16 | 19p13.12 | miR-5095 (-), -23a-3p (-), -23a-5p (-), -27a-3p (-), -27a-5p (-), -181c-3p (+), -181c-5p (+), -584-5p (+) | NANOS3 (+) | --
16 | 20q11.21 | - | TPX2 (+) | ST
16 | 20q13.2 | miR-5095 (-) | SRC (+) | PO
16 | 21q22.13 | miR-5095 (+), -548d-5p (-) | DYRK1A (+) | --
16 | 22q12.1 | miR-548c-5p (+) | CHEK2 (-) | ST
16, 18, 45 | 22q13.1 | miR-5095 (2; -,-) | MCM5 (+) | PO
16 | Xq25 | miR-5095 (-), -574-5p (-) | DCAF12L2 (-) | OG

¹ The information in brackets shows the number of miRNA binding sites and whether the miRNAs are located on the positive-sense (+) or negative-sense (-) DNA strand.

² The information in brackets shows whether the cell cycle regulatory genes are located on the positive-sense (+) or negative-sense (-) DNA strand.

³ Cl.: classification of cellular genes; ST: tumour suppressors; OG: oncogenes; PO: proto-oncogenes.

Ninety-six possible interactions were identified between 37 mature miRNAs associated with CC and 42 cell cycle regulatory genes located in proximity to the viral insertion sites. The network of interactions is presented in Figure 6. Of these interactions, 35.42% involved miR-5095, 12.5% involved miR-548c-5p and 12.5% involved miR-548d-5p.
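As a quick arithmetic check, the reported percentages can be converted back into whole numbers of interactions. The sketch below assumes the percentages are simple shares of the 96 reported interactions; the resulting counts are back-calculated here and are not stated in the text.

```python
# Back-calculate interaction counts from the percentages reported in the text.
TOTAL_INTERACTIONS = 96  # total number of interactions reported

# Shares of the interactions, in percent, as reported
reported_share = {
    "miR-5095": 35.42,
    "miR-548c-5p": 12.5,
    "miR-548d-5p": 12.5,
}


def implied_count(percent, total):
    """Nearest whole number of interactions implied by a percentage of the total."""
    return round(percent * total / 100)


implied = {name: implied_count(pct, TOTAL_INTERACTIONS)
           for name, pct in reported_share.items()}
# Interactions left over for all other miRNAs (derived, not reported)
remaining = TOTAL_INTERACTIONS - sum(implied.values())

for name, count in implied.items():
    share = 100 * count / TOTAL_INTERACTIONS
    print(f"{name}: {count} of {TOTAL_INTERACTIONS} ({share:.2f}%)")
print(f"other miRNAs: {remaining}")
```

Under this assumption, the three miRNAs together would account for a little over 60% of the network edges.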


Figure 6. The network of interactions between cervical cancer-associated miRNAs and cell cycle regulatory genes present at HPV integration sites.

Rectangles of various colors represent the cell cycle regulatory genes, and each color depends on the gene's classification (ST, OG, POG and IND). The arrows represent the interactions between miRNAs and genes involved in cell cycle regulation; each arrow's color depends on the DNA chain where the miRNAs and cell cycle regulatory genes are located.

38.1% of the genes identified in HPV integration sites have binding sites for a single miRNA, and 61.9% have binding sites for two or more miRNAs. Table 6 displays the genes with five or more miRNA binding sites.

Table 6. Genes associated with five or more miRNA binding sites.

Number of miRNA binding sites | miRNAs | Gene
5 sites | miR-103a-3p, -107, -548b-5p, -548d-5p and -5095 | PTEN
5 sites | miR-194-5p, -215-3p, -215-5p, -548b-5p and -5095 | PROX1
7 sites | miR-449a, -449b-3p, -449b-5p, -548c-3p, -548d-5p, -581 and -5095 | MAP3K1
8 sites | miR-1-3p, -133a-3p, -133a-5p, -133b, -378a-3p, -548b-5p, -548d-5p and -5095 | TTC39C
8 sites | miR-23a-3p, -23a-5p, -27a-3p, -27a-5p, -181c-3p, -181c-5p, -584-5p and -5095 | NANOS3
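
The selection behind Table 6 can be sketched as a count of distinct miRNAs per gene followed by a threshold. The sketch below uses only a subset of the Table 5 rows; the dictionary layout and the function name `genes_with_at_least` are illustrative assumptions. It counts distinct miRNAs rather than individual binding sites, which appears to be how Table 6 tallies them.

```python
# Subset of Table 5: gene -> miRNAs with binding sites at the HPV integration site.
binding = {
    "PTEN": ["miR-103a-3p", "miR-107", "miR-548b-5p", "miR-548d-5p", "miR-5095"],
    "PROX1": ["miR-194-5p", "miR-215-3p", "miR-215-5p", "miR-548b-5p", "miR-5095"],
    "MAP3K1": ["miR-449a", "miR-449b-3p", "miR-449b-5p", "miR-548c-3p",
               "miR-548d-5p", "miR-581", "miR-5095"],
    "TTC39C": ["miR-1-3p", "miR-133a-3p", "miR-133a-5p", "miR-133b",
               "miR-378a-3p", "miR-548b-5p", "miR-548d-5p", "miR-5095"],
    "NANOS3": ["miR-23a-3p", "miR-23a-5p", "miR-27a-3p", "miR-27a-5p",
               "miR-181c-3p", "miR-181c-5p", "miR-584-5p", "miR-5095"],
    "SOX2": ["miR-5095"],
    "BRCA1": ["miR-5095"],
    "CHEK2": ["miR-548c-5p"],
}


def genes_with_at_least(mapping, min_mirnas):
    """Return {gene: number of distinct miRNAs} for genes meeting the threshold,
    ordered by count and then by gene name."""
    counts = {gene: len(set(mirnas)) for gene, mirnas in mapping.items()}
    selected = {gene: n for gene, n in counts.items() if n >= min_mirnas}
    return dict(sorted(selected.items(), key=lambda item: (item[1], item[0])))


table6_like = genes_with_at_least(binding, 5)
for gene, n in table6_like.items():
    print(f"{gene}: {n} miRNAs")
```

With the threshold set to 5, the three single-miRNA genes in this subset are dropped and the five genes listed in Table 6 remain.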

A gene may have binding sites for both regions of complementarity (3' and 5') of a miRNA38. In this study, we found that the TTC39C gene has binding sites for miR-133a-3p and miR-133a-5p and MAP3K1 has binding sites for miR-449b-3p and miR-449b-5p, though some mature sequences from one miRNA also showed binding sites to different genes (Figure 6). As an example, the miR-548c-3p mature chain has binding sites in the HAUS4 gene as well as in the MAP3K1, CDCA8, BCL2, ID4, cMYC, RAD51B, TSC2, ZBTB7C, FBXW7, CHEK2 and CDC7 genes (Figure 6).

Identification of miRNAs on Latin American human genomic variants

26.31% (10/42) of the miRNAs analysed (miR-1-3p, miR-31-3p, miR-107, miR-133a-3p, miR-133a-5p, miR-133b, miR-215-5p, miR-491-3p, miR-548d-5p and miR-944) were identical across the Latin American human genome variants, and 73.69% showed sequence variation (substitution or deletion of nucleotides) (Figure 7, Panels A and B).


Figure 7.

A) Number of miRNAs and nucleotide substitutions found in each human genomic variant; B) Number of miRNAs with between 1 and 7 nucleotide substitutions; C) Number of miRNAs with nucleotide substitutions in one, two or three genomic variants in the Latin American human genome, and D) Types of nucleotide substitutions in the miRNA sequences associated with CC in the selected human genome variants.

When the sequences of these miRNAs were mapped to the selected Latin American human genome variants (Supplementary File 3), 88 miRSNPs related to miRNAs or miRNA binding sites were identified on the Latin American variants compared with 33 on the reference variant. Twenty-one miRSNPs were located in the miRNA seed sequences of the Latin American variants compared with three in the reference variant. The most representative mapping results are shown in Table 7.
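
Whether a substitution falls inside the seed sequence can be checked position by position. The sketch below is an illustration under stated assumptions, not the procedure used in the study: it takes the seed to be nucleotides 2 to 8 of the mature miRNA (a common convention, assumed here), uses the CLM row of hsa-mir-23a-3p from Table 7 as an arbitrary baseline, and the function names are hypothetical.

```python
# Assumption: the seed spans nucleotides 2-8 (1-based, inclusive) of the mature miRNA.
SEED_START, SEED_END = 2, 8


def substitutions(baseline, variant):
    """List (position, baseline_base, variant_base) for equal-length sequences.
    Positions are 1-based from the 5' end."""
    if len(baseline) != len(variant):
        raise ValueError("sequences must have equal length; deletions need an alignment")
    return [(i + 1, b, v)
            for i, (b, v) in enumerate(zip(baseline, variant)) if b != v]


def in_seed(position):
    """True if a 1-based position lies inside the assumed seed region."""
    return SEED_START <= position <= SEED_END


# hsa-mir-23a-3p sequences as listed in Table 7
baseline = "AUCACAUUGCCAGGGAUUUCC"  # CLM row, used here only as a comparison baseline
pel = "AUAACAUUGCAAGGGAUUUCC"       # PEL row
beb = "AUCACAUCGCCAGGGAUUUCC"       # BEB row

pel_subs = substitutions(baseline, pel)
beb_subs = substitutions(baseline, beb)
for label, subs in (("PEL", pel_subs), ("BEB", beb_subs)):
    for pos, ref, alt in subs:
        where = "seed" if in_seed(pos) else "outside seed"
        print(f"{label}: {ref}->{alt} at position {pos} ({where})")
```

Sequences that differ in length, such as those with deletions, would need a proper alignment before positions can be compared, so the function rejects them.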

Analysis of the types of nucleotide substitutions in the miRNA sequences associated with CC in the selected human genome variants showed that transversions were more frequent than transitions and that the most frequent nucleotide substitutions were G→U (16.9%), followed by A→C (15.7%), C→A (15.7%) and G→A (10.8%) (Figure 7).
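
The distinction between the two substitution classes can be made explicit with a short sketch: a transition exchanges a purine for a purine (A, G) or a pyrimidine for a pyrimidine (C, U), and any other change is a transversion. The function name is illustrative; the percentages are those reported above.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}


def substitution_class(ref, alt):
    """Classify an RNA base substitution as 'transition' or 'transversion'."""
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases:
        raise ValueError("bases must be one of A, C, G, U")
    if ref == alt:
        raise ValueError("ref and alt are identical; not a substitution")
    same_family = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_family else "transversion"


# Most frequent substitutions and their reported frequencies (percent)
reported = {("G", "U"): 16.9, ("A", "C"): 15.7, ("C", "A"): 15.7, ("G", "A"): 10.8}

totals = {"transition": 0.0, "transversion": 0.0}
for (ref, alt), pct in reported.items():
    kind = substitution_class(ref, alt)
    totals[kind] += pct
    print(f"{ref}->{alt}: {kind} ({pct}%)")
print({kind: round(value, 1) for kind, value in totals.items()})
```

Among the four most frequent substitutions, three are transversions, which is consistent with the statement that transversions predominate.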

Between one and 18 nucleotide deletions were detected in miR-27a-3p, miR-31-5p, miR-103a-3p, miR-191-3p, miR-215-3p and miR-574. The sequences of miR-28, miR-152, miR-548c-5p, miR-572 and miR-5095 mapped only to the reference sequence (version GRCh38/hg38) and not to any of the Latin American human genomic variants. miR-152 did not map to the PUR variant (Table 7).

Table 7 displays the nucleotide variations in the human genome variants from Colombia, Mexico, Peru and Puerto Rico, and in the variant from Bangladesh, which was used as the control.

Table 7. miRNAs identified in HPV integration sites, displaying the nucleotide variations in the selected Latin American human genome variants and the control variant.

More data is available in Supplementary File 3.

HG¹ | miRNAs identified in HPV integration sites (chromosomal location (chain))²

HG | hsa-mir-1-3p (18q11.2 (-)) | hsa-mir-23a-3p (19p13.12 (-))
CLM | UGGAAUGUAAAGAAGUAUGUAU | AUCACAUUGCCAGGGAUUUCC
MXL | UGGAAUGUAAAGAAGUAUGUAU | AUCACAUUGCCAGGGAUUUCC
PEL | UGGAAUGUAAAGAAGUAUGUAU | AUAACAUUGCAAGGGAUUUCC
PUR | UGGAAUGUAAAGAAGUAUGUAU | AUCACAUUGCCAGGGAUUUCC
BEB | UGGAAUGUAAAGAAGUAUGUAU | AUCACAUCGCCAGGGAUUUCC
Variation | Conserved | Nucleotide substitution

HG | hsa-mir-31-5p (9p21.3 (-)) | hsa-mir-152 (17q21.32 (-))
CLM | AGGCAAGAUGCUGGCAUAGCU | CGGGUCUGUGCUACACUCCGACU
MXL | AGGCAAGAUGCUGGCAUAGCU | CGACU
PEL | AGGCAAGAUGCUGGCAU | AGGUUCUGUGAUACACUACGACU
PUR | AGGCAAGAUGCUGGCAUAGCU | (absent)
BEB | AGGCAAGAUGCUGGCAUAGCU | AGGUUCUGUUGUGCACUCUGACU
Variation | Nucleotide deletion | Absence of the miRNA sequence

¹ HG: human genome; CLM: Colombians from Medellín, Colombia; MXL: individuals with Mexican ancestry from Los Angeles, USA; PEL: Peruvians from Lima, Peru; PUR: Puerto Ricans from Puerto Rico; BEB: Bengali from Bangladesh.

² The size of each letter indicates the enrichment of each nucleotide in the Latin American variants of the human genome, displayed through WebLogo.
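
The labels used in Table 7 (conserved, nucleotide substitution, nucleotide deletion, absence of the sequence) can be assigned programmatically. The sketch below is a simplified illustration, not the authors' pipeline: it takes the most common sequence across the five populations as the baseline and treats any shorter sequence as a deletion, both of which are assumptions. The function names are hypothetical.

```python
from collections import Counter


def consensus(by_population):
    """Most common non-empty sequence across populations (assumed baseline)."""
    observed = [seq for seq in by_population.values() if seq]
    return Counter(observed).most_common(1)[0][0]


def classify(baseline, observed):
    """Label one sequence relative to the baseline (length-based simplification)."""
    if not observed:
        return "absent"
    if observed == baseline:
        return "conserved"
    if len(observed) < len(baseline):
        return "deletion"
    if len(observed) == len(baseline):
        return "substitution"
    return "insertion"


# Sequences as listed in Table 7
TABLE7 = {
    "hsa-mir-1-3p": dict.fromkeys(
        ["CLM", "MXL", "PEL", "PUR", "BEB"], "UGGAAUGUAAAGAAGUAUGUAU"),
    "hsa-mir-23a-3p": {
        "CLM": "AUCACAUUGCCAGGGAUUUCC", "MXL": "AUCACAUUGCCAGGGAUUUCC",
        "PEL": "AUAACAUUGCAAGGGAUUUCC", "PUR": "AUCACAUUGCCAGGGAUUUCC",
        "BEB": "AUCACAUCGCCAGGGAUUUCC",
    },
    "hsa-mir-31-5p": {
        "CLM": "AGGCAAGAUGCUGGCAUAGCU", "MXL": "AGGCAAGAUGCUGGCAUAGCU",
        "PEL": "AGGCAAGAUGCUGGCAU", "PUR": "AGGCAAGAUGCUGGCAUAGCU",
        "BEB": "AGGCAAGAUGCUGGCAUAGCU",
    },
}


def summarize(table):
    """Return {miRNA: {population: label}} using the per-miRNA consensus."""
    summary = {}
    for mirna, by_population in table.items():
        baseline = consensus(by_population)
        summary[mirna] = {pop: classify(baseline, seq)
                          for pop, seq in by_population.items()}
    return summary


summary = summarize(TABLE7)
for mirna, labels in summary.items():
    print(mirna, labels)
```

hsa-mir-152 is left out of the example because its four observed sequences all differ, so no majority baseline can be chosen without the reference genome.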

Discussion

HPV integration sites

According to the literature, approximately 570 integration sites have been identified for eight oncogenic HPV types associated with CC (Figure 2). HPV integration into cellular DNA and the consequent deregulation of genes are considered a crucial step in cancer progression. Genotype HPV-16 is the most studied for its relationship with CC, as it is responsible for 70% of cases worldwide39. This could be a consequence of the greater proportion of integration sites reported for this genotype. In contrast, low-risk genotypes, such as HPV-45, -66 and -93 reported in Colombia, are frequent in CC40–44.

HPV integration into the host genome occurs in regions known as fragile sites, breakpoints or transcriptionally active regions45. This integration induces functional alterations of cellular genes in close proximity12,46–48. According to our results, the 8q24.21 chromosome region is the most affected by HPV integration. Proto-oncogenes such as the MYC gene are located in this region49 (as displayed in Figure 3), and MYC represents a family of genes overexpressed in several tumours, including CC49–51; inhibition of MYC expression can induce cancer cell destruction50. In this context, the MYC gene could be both a tumour biomarker and a potential treatment target for several tumours51 (Table 2).

Chromosomes 1, 14, 19 and X contain significantly more mature miRNAs than the other chromosomes, and chromosome 18 contains fewer miRNAs. The 19q13.4 chromosome region contains the largest group of human miRNAs (known as the chromosome 19 miRNA cluster, "C19MC"), several of which have previously been reported to be altered in cancer52. Studies have reported associations between chromosome 1 and malignant transformation in cancers, including CC53.

The 578 integration sites identified in eight HPV types associated with CC were located in cell cycle regulatory genes, including the tumour suppressor genes TP73, P3H2, TP63, NBN, PTEN, BRCA1, and TPX2; the oncogenes EIF4E, CDCA8, MDM2, and PVT1; and the proto-oncogenes SRC, MYC, MCM5, CXCL8, and BCL2. Their deregulation could explain the progression of CC (Figure 3).

miRNA binding sites associated with cervical cancer

In 2011, Reshmi et al. used BLAT, together with other bioinformatics programmes and computational tools, to determine the exact location of four miRNA binding sites associated with CC54. To the best of our knowledge, this study is the first to use BLAT to identify miRNA binding sites in proximity to HPV integration sites involved in CC progression. In this study, 2028 binding sites from 272 CC-associated miRNAs were identified.
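
What mapping a mature miRNA sequence to both DNA strands involves can be illustrated with a naive exact-match search. This is not BLAT, which is an alignment tool that tolerates mismatches and gaps; the sketch only converts the RNA sequence to DNA and looks for it, and for its reverse complement, in a target sequence. The function names and the toy genome are hypothetical; the query is the hsa-mir-1-3p sequence from Table 7.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def rna_to_dna(rna):
    """Convert a mature miRNA sequence (with U) to its DNA equivalent (with T)."""
    return rna.upper().replace("U", "T")


def reverse_complement(dna):
    """Reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]


def find_exact_matches(mirna, genome):
    """Return sorted (0-based start, strand) pairs for exact matches.
    '+' means the query itself occurs in `genome`; '-' means its reverse
    complement occurs, i.e. the query lies on the opposite strand."""
    query = rna_to_dna(mirna)
    hits = []
    for strand, pattern in (("+", query), ("-", reverse_complement(query))):
        start = genome.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = genome.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)


mir_1_3p = "UGGAAUGUAAAGAAGUAUGUAU"  # hsa-mir-1-3p, as listed in Table 7
query_dna = rna_to_dna(mir_1_3p)

# Toy target sequence (hypothetical): one site on each strand
toy_genome = "GGCC" + query_dna + "AAAA" + reverse_complement(query_dna) + "CC"

hits = find_exact_matches(mir_1_3p, toy_genome)
print(hits)
```

An exact search of this kind would miss the substituted or truncated sequences seen in the population variants, which is one reason an alignment tool is needed for the real analysis.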

Identification of the target mRNAs of these miRNAs is considered a key step in their structural and functional analysis, as it helps to establish possible interactions and, consequently, the cellular processes that may be altered in CC progression55–57. miRNAs mapped to the two strands of cellular DNA (5’ and 3’ strands) can interact with either strand in both orientations and form triple helix structures that enhance RNA stability58,59.

Each CC-associated miRNA showed a different number of binding sites in the human genome (Table 3, Supplementary File 2) and in the human genomic variants17,21,60,61; miRNAs were distributed throughout the genomes in both intronic and exonic regions13. In this study, CC-associated miRNAs were distributed across the karyotype, with chromosomes 1, 19, 5, 2, 3, 14, 7 and X having the largest number of miRNA binding sites (Table 4). These results are consistent with those reported by Calin et al.12. The greater number of miRNA binding sites on some chromosomes provides evidence of a non-random distribution of miRNAs within the chromosomes.

Our results showed a low number of exonic miRNAs. These exonic miRNAs are considered rare miRNAs62, which are important candidates for gaining a better comprehension of interaction networks between miRNAs and their CC-associated targets.

The miRNA binding sites are within a short distance of each other in the chromosome, indicating that they tend to cluster63–66. Altuvia et al. reported miRNAs in groups of two or three64. This is in line with our results on CC-associated miRNA binding sites, although we found that miRNAs are capable of forming groups of more than six miRNAs on both strands of human DNA (Figure 4). We identified an important group of 16 miRNAs that can form these clusters and are located on chromosome 14, region 14q32.31. They include hsa-miR-134, miR-299, miR-323a, miR-329, miR-376a, miR-376c, miR-379, miR-411, miR-485, miR-487a, miR-487b, miR-494, miR-495, miR-539, miR-654 and miR-5095 (Supplementary File 2). Understanding their individual and collective roles is important when studying the development of this neoplasia.
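
One simple way to make "within a short distance of each other" operational is to group binding sites on the same chromosome whenever the gap between neighbouring sites stays below a threshold. The sketch below shows this single-linkage grouping; the threshold, the coordinates and the function name are hypothetical, and the study does not state which distance criterion was applied.

```python
def cluster_sites(positions, max_gap):
    """Group positions on one chromosome into clusters in which consecutive
    sites are at most `max_gap` bases apart."""
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)  # close enough: extend the current cluster
        else:
            clusters.append([pos])    # too far: start a new cluster
    return clusters


# Hypothetical binding-site start coordinates on a single chromosome
site_positions = [20400, 100, 900, 51000, 350, 20000]
MAX_GAP = 1000  # assumed threshold, in bases

clusters = cluster_sites(site_positions, MAX_GAP)
# Keep only groups with at least three sites
large_clusters = [c for c in clusters if len(c) >= 3]

print(clusters)
print(large_clusters)
```

The choice of `max_gap` determines how many groups are found, so any cluster count depends on stating that threshold explicitly.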

miR-5095 had the highest number of binding sites distributed throughout the human genome (Table 3), which is in accordance with previously reported data66–68, where approximately 900 binding sites were identified; these sites are probably related to the expression of many target mRNAs and to many biological processes. Based on its extensive genomic distribution and low specificity in CC, miR-5095 is a good candidate to be used as an indicator of genetic variability within the human population.

miRNAs located in HPV integration sites

To identify the role of miRNAs, HPV integration sites located in cell cycle-controlling genes were analysed. Thirty-seven miRNAs were identified in HPV integration sites close to cell cycle-controlling genes (Table 5). Nambaru et al. and Schmitz et al. identified numerous miRNAs in the proximity of HPV integration sites and reported that approximately 65% of these were involved in cervical carcinogenesis8,9. Inactivation of tumour suppressor genes by viral integration increases genomic instability and leads to cervical malignant neoplasm progression69.

Multiple miRNA binding sites on a target may decrease the levels of mRNA translation and improve the specificity of gene regulation. For example, one miRNA can have multiple target genes and each individual mRNA can be regulated by numerous miRNAs13,70,71. Ninety-six interactions were identified between miRNAs and cell cycle regulatory genes (Table 4–Table 5, Figure 4–Figure 6); miR-5095, -548c-5p and -548d-5p showed the highest number of interactions with these kinds of genes.

Ivashchenko et al. identified miR-5095 binding sites in the BRCA1 gene67. In this study, miR-5095 was also found to have binding sites in the BAK1, BARD1, CITED2, MCM5, SRC, PARD3B, PPP2CA, RHEB, SOX2 and XPO1 genes (Table 5 and Figure 6). Our findings provide a basis for searching for other interactions, gene targets, and CC-associated miRNAs.

During miRNA biogenesis, each pre-miRNA produces two mature miRNAs, namely miRNA-5p and miRNA-3p72. Mature miRNA deregulation can have an important role in tumour development, suggesting the need to analyse each mature sequence (miRNA-5p and -3p). In this study, binding sites were analysed for both mature miRNA sequences (-5p and -3p) in several interactions (Figure 6). A mature miRNA sequence, such as miR-548c, demonstrated binding sites in different cellular genes. Thus, this miRNA could serve as a candidate biomarker for CC prognosis and diagnosis.

Han et al. characterized the two mature chains of miR-21 and their oncogenic roles in cervical cancer73. The regulation of the mature 5p and 3p chains from several miRNAs has been investigated in other cancers, including colorectal, gastric, breast, lung, kidney, and bladder36,72,74–77, suggesting the need to focus further studies on the two mature chains from the 272 miRNAs reported in this study.

Figure 6 shows the complexity of the interactions between miRNAs and tumour suppressor genes, proto-oncogenes and oncogenes. The study of interaction networks between cell cycle genes and miRNAs involved in cancer is one of the most recent challenges in systems biology and is important for elucidating the control mechanisms of biological processes in cancer78–81.

miRNAs in HPV integration sites and Latin American human genome variants

The differences in miRNA expression profiles between normal and cancerous tissues have led to the identification of clinical biomarkers for the early detection of many diseases, including various cancers and their precursor stages79,82,83. Research on miRNAs associated with cancer has not taken into account the genetic variability in human populations, which influences the structure, expression and function of miRNAs in populations from different ethnic backgrounds. Studies on genetic variability are relevant to designing strategies for the diagnosis and prognosis of various diseases.

miR-1-3p, miR-31-3p, miR-107, miR-133a-3p, miR-133a-5p, miR-133b, miR-215-5p, miR-491-3p, miR-548d-5p and miR-944 were conserved in the four human genome variants. In the remaining 27 miRNAs, substitutions, deletions or insertions were observed in the nucleotide sequences, suggesting that this variability could be decisive in determining susceptibility to the development of CC (Table 7 and Supplementary File 3).

Numerous studies have analysed miRSNPs in different malignancies84–86, but no data are available on the correlation of SNPs in CC-associated miRNAs located in HPV integration sites in Latin American human genomic variants.

According to our results, the genomes from Latin America showed a lower miRSNP frequency than the control genome (BEB), although the frequency in the Colombian (CLM) genome was more similar to that of the BEB genome. Latin American populations have received migration from Europe, Asia and Africa87. Thus, our results could reflect the specific admixture of Colombian populations as well as migration patterns during human settlement in Latin America.

miRSNPs can affect the structure and function of miRNAs by altering interactions between miRNAs and their mRNA targets or by interfering with the expression levels of individual miRNAs20–22,88,89. miRSNPs could cause the loss or gain of binding sites, affect the co-evolution of miRNAs and their target mRNAs, and even influence cell processes related to tumour progression, disease phenotypes or susceptibility to developing a specific disease.

More studies are needed to clarify the role, targets and transcriptional regulatory mechanisms of the cellular events in which miRNAs are involved, including differentiation, apoptosis, metabolism and carcinogenesis. The expression and deregulation of miRNAs in cancer, as well as their role as biological markers in the diagnosis and treatment of CC, should be explored. Further identification of cellular genes and signalling pathways involved in CC progression could lead to the development of new therapeutic strategies based on miRNAs90,91. Additional biomarkers associated with apoptosis and necrosis, and possible interactions with CRISPR complex sequences, could also be explored in order to develop therapeutic strategies in the future.

Data availability

Dataset 1. The mature miRNA reference sequences were obtained in FASTA format from the miRBase database. DOI, 10.5256/f1000research.10138.d16473228

Dataset 2. Matrix of data containing all the necessary components for the validation of data on CC-associated miRNAs in HPV integration sites in Latin American human genomic variants. DOI, 10.5256/f1000research.10138.d16473636

how to cite this article
Guerrero Flórez M, Guerrero Gómez OA, Mena Huertas J and Yépez Chamorro MC. Mapping of microRNAs related to cervical cancer in Latin American human genomic variants [version 1; peer review: 2 approved with reservations]. F1000Research 2017, 6:946 (https://doi.org/10.12688/f1000research.10138.1)

Open Peer Review

ApprovedThe paper is scientifically sound in its current form and only minor, if any, improvements are suggested
Approved with reservations A number of small changes, sometimes more significant revisions are required to address specific details and improve the papers academic merit.
Not approvedFundamental flaws in the paper seriously undermine the findings and conclusions
Reviewer Report 28 Sep 2017
Subhash Mohan Agarwal, Bioinformatics Division, ICMR-National Institute of Cancer Prevention and Research, Noida, Uttar Pradesh , India 
Approved with Reservations
In the present study the authors have mapped the miRNA involved in cervical cancer on to Latin American genome using in silico predictions. As cervical cancer has the highest mortality rates in low and middle income countries we do need ... Continue reading
HOW TO CITE THIS REPORT
Agarwal SM. Reviewer Report For: Mapping of microRNAs related to cervical cancer in Latin American human genomic variants [version 1; peer review: 2 approved with reservations]. F1000Research 2017, 6:946 (https://doi.org/10.5256/f1000research.10920.r24293)
  • Author Response 18 Sep 2023
    Milena Guerrero, Department of Biology, Center for Health Studies at the University of Nariño (CESUN), University of Nariño, Pasto, Nariño, Colombia
    18 Sep 2023
    Author Response
    Dear Reviewer SMA.
    We resubmitted the second version of the paper after addressing the various concerns raised.
    We would like to thank for their time and for their constructive comments ... Continue reading
Reviewer Report 11 Jul 2017
Juan Manuel Anzola, Bioinformatics & Computational Biology, Corporación CorpoGen, Bogotá, Colombia 
Approved with Reservations
In this work, Guerrero et al. use mature microRNA in order to detect possible targets of these microRNAs in the human genome, and its population variants, including from Latin American, in order to determine possible associations with cervical cancer. 
... Continue reading
HOW TO CITE THIS REPORT
Anzola JM. Reviewer Report For: Mapping of microRNAs related to cervical cancer in Latin American human genomic variants [version 1; peer review: 2 approved with reservations]. F1000Research 2017, 6:946 (https://doi.org/10.5256/f1000research.10920.r23646)
  • Author Response 18 Sep 2023
    Milena Guerrero, Department of Biology, Center for Health Studies at the University of Nariño (CESUN), University of Nariño, Pasto, Nariño, Colombia
    18 Sep 2023
    Author Response
    We resubmitted the second version of paper after addressing the various concerns raised.
    We would like to thank you for their time and for their constructive comments to help assist ... Continue reading