Variable expression and silencing of CRISPR-Cas9 targeted transgenes identifies the AAVS1 locus as not an entirely safe harbour

Background: Diseases such as hypertrophic cardiomyopathy (HCM) can lead to severe outcomes including sudden death. The generation of human induced pluripotent stem cell (hiPSC) reporter lines can be useful for disease modelling and drug screening by providing physiologically relevant in vitro models of disease. The AAVS1 locus is cited as a safe harbour that is permissive for stable transgene expression, and hence is favoured for creating gene targeted reporter lines.
Methods: We generated hiPSC reporters using a plasmid-based CRISPR/Cas9 nickase strategy. The first intron of PPP1R12C, the AAVS1 locus, was targeted with constructs expressing a genetically encoded calcium indicator (R-GECO1.0) or HOXA9-T2A-mScarlet reporter under the control of a pCAG or inducible pTRE promoter, respectively. Transgene expression was compared between clones before, during and/or after directed differentiation to mesodermal lineages.
Results: Successful targeting to AAVS1 was confirmed by PCR and sequencing. Of 24 hiPSC clones targeted with pCAG-R-GECO1.0, only 20 expressed the transgene and in these, the percentage of positive cells ranged from 0% to 99.5%. Differentiation of a subset of clones produced cardiomyocytes, wherein the percentage of cells positive for R-GECO1.0 ranged from 2.1% to 93.1%. In the highest expressing R-GECO1.0 clones, transgene silencing occurred during cardiomyocyte differentiation causing a decrease in expression from 98.93% to 1.3%. In HOXA9-T2A-mScarlet hiPSC reporter lines directed towards mesoderm lineages, doxycycline induced a peak in transgene expression after two days but this reduced by up to ten-thousand-fold over the next 8-10 days. Nevertheless, for R-GECO1.0 lines differentiated into cardiomyocytes, transgene expression was rescued by continuous puromycin drug selection, which allowed the Ca 2+ responses associated with HCM to be investigated in vitro using single cell analysis.
Conclusions: Targeted knock-ins to AAVS1 can be used to create reporter lines but variability between clones and transgene silencing requires careful attention by researchers seeking robust reporter gene expression.


Introduction
A key consideration for targeted gene delivery in human induced pluripotent stem cells (hiPSCs) is the genomic location at which to insert the exogenous DNA sequence to maximise transgene expression and limit disruption of critical endogenous genes and their function. To this end, a number of chromosomal locations that are amenable to integration have been exploited. These regions of the genome are commonly referred to as safe harbour loci, and often share some common properties such as limited disruption to endogenous genes, low proximity to oncogenes and a chromatin structure that is not prone to epigenetic silencing 1,2 .
Examples of previously utilised genomic safe harbour loci include the chemokine (C-C motif) receptor 5 (CCR5) gene 3,4 , the human orthologue of the mouse Rosa26 locus (hROSA26) 5 , and a region within intron 2 of the Citrate Lyase Beta-Like (CLYBL) gene 6 .
The AAVS1 locus is an area of chromosome 19 (position 19q13.42) that has been found to be a common integration site for exogenous DNA delivered to cultured cells with adeno-associated virus (AAV) 7,8 . Integration into this site is associated with only limited disruption of endogenous genes. The phosphatase 1 regulatory subunit 12C (PPP1R12C) gene codes for a protein with a poorly defined function, and its first intron is disrupted by integration into the AAVS1 site, with no observed deleterious consequences in targeted human pluripotent stem cells (hPSCs) 2,9 . DNA sequences inserted at this location are supposedly protected by endogenous insulator regions 10 . These insulators are considered to contribute to maintaining an open chromatin conformation at the AAVS1 locus, reducing the likelihood of transgene silencing compared to other safe harbour loci such as CCR5 4,11 . However, reports of DNA methylation dampening transgene expression in both hPSC-derived hepatocytes 12 and iPSC-derived myeloid progenitors 13 raise the question of whether a 'perfect' safe harbour locus exists. Despite this, AAVS1 has remained popular for gene targeting [13][14][15][16] .
We sought to utilise CRISPR/Cas9 nickase to target the AAVS1 locus in hiPSCs and introduce a genetically encoded calcium indicator, R-GECO1.0, to enable live Ca 2+ imaging in hiPSC-derived cardiomyocytes (hiPSC-CMs) 18,19 . This was performed in genome engineered isogenic hiPSC lines we previously described to model the condition, hypertrophic cardiomyopathy (HCM). These included a trio of lines harbouring a c.MYH7 C9123T mutation 20 and a duo harbouring a c.ACTC1 G301A mutation 21 .
In addition, CRISPR Cas9 targeting of the AAVS1 locus was used to target a doxycycline-inducible HOXA9-T2A-mScarlet cassette into hiPSCs with the aim of modulating HOXA9 during haematopoietic differentiation. HOXA9 is a transcription factor regulated spatio-temporally during haematopoietic or cardiac development 22 and the aim was to examine if controlled supplemental expression of HOXA9 resulted in more efficient production of mature cells.
We found that, far from being a safe harbour locus, AAVS1 integration was associated with transgene expression that varied between clones and/or was silenced during directed differentiation towards both haematopoietic cells and cardiomyocytes. This suggests that silencing at the AAVS1 locus is not limited to the endoderm lineage as previously described 12 . Nevertheless, by altering our methods from bulk population analysis to single cell confocal laser line scan microscopy, we used the hiPSC-CMs expressing R-GECO1.0 to investigate the impact of HCM mutations on Ca 2+ transients. Abnormalities were found in both HCM-associated mutations c.MYH7 C9123T and c.ACTC1 G301A , and this phenotype was successfully rescued with drug treatment. This demonstrates an in vitro alternative to some aspects of drug testing on animal models of HCM. Finally, we conclude that the AAVS1 locus cannot be considered a true safe harbour. Researchers seeking to target this locus should check clones for transgene expression status both in hiPSCs and in differentiated progeny.

Ethical statement
Informed patient consent was obtained for all patient-derived hiPSC samples to be used for research purposes. Isolation and use of patient fibroblasts was approved by the Nottingham Research Ethics Committee (License 09/H0408/74), and sample collections are registered with the UK Clinical Research Network under project 8164.
hiPSC culture and differentiation
All cell culture experiments were performed in a type II Biological Safety Cabinet, and cells were incubated in a humidified incubator at 37°C and 5% CO 2 . hiPSCs were routinely maintained in E8 medium on 1:100 Matrigel (Corning #356235) coated plastic ware (Nunc). Cells were passaged every three days by washing once with Ca 2+ /Mg 2+ -free Phosphate Buffered Saline (PBS, Gibco #14190-094), followed by incubation with TrypLE for four minutes. Subsequently, hiPSCs were resuspended in E8 supplemented with 10 μM Y-27632 (ROCKi, Tocris

Amendments from Version 1
We thank both reviewers for their constructive and helpful comments regarding the original manuscript. The amendments to the manuscript include changing the pseudocolouring of immunocytochemistry images in Figure 2 and Figure 3 such that R-GECO is always pseudocoloured red to aid consistency for the reader. Figure 5 has also been amended to consistently display scale bars on the confocal line scan kymographs for time and laser line length. Additional data have been included in Extended Data, including original PCR genotyping of clones to detect AAVS1 integration, details of the sgRNAs and how they were designed, and an alternative graphical representation of Figure 4 to show DeltaCt rather than relative quantification. More clarity was provided on the function of the HOXA9-mScarlet line, as well as the CRISPR targeting strategy to generate the isogenic sets of HCM mutant hiPSCs. In addition, the discussion has been expanded to note a recent paper, published weeks after the original manuscript, which describes AAVS1 transgene silencing in iPSC-derived myeloid cells.

CRISPR-Cas9 gene targeting of the AAVS1 locus
In order to target the AAVS1 locus in hiPSCs, a targeting vector was constructed containing either the CAG-R-GECO1.0-IRES-Puro cassette 18 or the doxycycline-inducible HOXA9-T2A-mScarlet-CAG-G418 cassette flanked on each side with 1 kb of homology to the AAVS1 locus 24 . A total of 1 μg of AAVS1 targeting vector was transfected into 1 × 10^6 hiPSCs, with 500 ng of each AAVS1 guide RNA pU6 vector and 1 μg of hCas9 D10A nickase plasmid, using an Amaxa 4D system (Lonza) according to the manufacturer's instructions. Twenty-four hours after transfection, the medium was supplemented with 0.3 μg/ml puromycin (Life Technologies #A1113802) or 50 μg/ml Geneticin™ (Life Technologies #10131027), depending on the drug selection cassette, for positive selection of clones up to 10 days post-transfection. Drug-resistant clones were then isolated using 0.5 mM EDTA and expanded.
Clones were then genotyped using polymerase chain reaction (PCR) on genomic DNA using Phusion® polymerase (NEB Cat# M0530S) and the primers given in Table 1. PCR cycle parameters were 95°C for 2 minutes, 60-64°C for 30 seconds and 72°C for 60 seconds, with a final elongation step of 72°C for 10 minutes.
Live imaging mScarlet expression
HOXA9 and mScarlet expression was induced with the addition of 1 μg/ml doxycycline every 48 hours. Live imaging of mScarlet fluorescence in differentiating hiPSCs was performed using Operetta™ high-content image analysis every two days. All images were taken using a 20x objective. Brightfield images were taken using 100 ms exposure time. mScarlet imaging was performed using 400 ms exposure, an excitation wavelength of 520-550 nm and an emission wavelength of 560-630 nm. Data analysis was performed using Columbus™ software (PerkinElmer) to identify and quantify cell regions expressing mScarlet fluorescence.

Gene expression analysis by qPCR
Real-time qPCR reactions were performed using TaqMan® Gene Expression Assays (Applied Biosystems) following the manufacturer's instructions. Briefly, TaqMan® mastermix (#4369016) including the HOXA9 probe 25 was added to a MicroAmp Fast 96-well plate (#4346907). Subsequently, cDNA samples (from initial 500 ng of reverse-transcribed RNA) were added to the plate, which was thereafter sealed with a film (#4311971). Amplification was performed in an ABI 7500 Real-Time PCR system (Applied Biosystems). Cycle conditions were 50°C for 2 minutes, 95°C for 10 minutes, followed by 40 cycles of 95°C for 15 seconds and 60°C for 1 minute. Normalisation was performed using the housekeeping gene B2M or PP1A.
Relative quantification was calculated using the ΔΔCT method 26 in Microsoft Excel.
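As an illustration of the ΔΔCT calculation referred to above, the following minimal Python sketch computes relative quantification as 2^(-ΔΔCT). It assumes approximately 100% amplification efficiency. The function name and the Ct values are hypothetical and are not data from this study, in which the calculation was performed in Microsoft Excel.

```python
def relative_quantity(ct_target_sample, ct_ref_sample,
                      ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCt) method.

    Assumes ~100% amplification efficiency (one doubling per cycle).
    """
    # Normalise the target gene to the housekeeping (reference) gene
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the normalised sample with the normalised control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values chosen only to illustrate the arithmetic
fold = relative_quantity(ct_target_sample=20.0, ct_ref_sample=18.0,
                         ct_target_control=30.0, ct_ref_control=18.0)
print(fold)
```

A lower Ct in the sample than in the control, after normalisation, gives a fold change greater than one.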
Confocal analysis and ClampFit identification of abnormal Ca 2+ transients
hiPSCs were differentiated as previously described 20,21 in RPMI B27 without phenol red and dissociated on day 15 by collagenase treatment. On day 30, hiPSC-CMs were seeded at a density of 150,000 cells per well in vitronectin-N coated MatTek dishes. Intracellular Ca 2+ transient measurements were made using an LSM 880C confocal microscope (Carl Zeiss) in the line-scan mode as previously described 27 . CMs were located using a 40x oil objective and a longitudinal line was drawn across a single CM. The R-GECO fluorophore was excited with a 561 nm laser at 0.8% power, with a detection range of 579-639 nm. Line-scan images were taken every 75 milliseconds, with a pixel dwell time of 4.12 μsec, for a total of 4000 cycles resulting in a five minute scan. CMs were kept at 37°C and 5% CO 2 and allowed to spontaneously beat throughout data acquisition.
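The acquisition settings above can be checked with simple arithmetic. The short Python sketch below uses only the line interval and cycle count quoted in the text; the variable names are illustrative.

```python
LINE_INTERVAL_MS = 75   # one line-scan image every 75 ms (from the text)
N_CYCLES = 4000         # total number of line scans (from the text)

# Total scan duration and the resulting sampling rate
total_seconds = LINE_INTERVAL_MS * N_CYCLES / 1000
total_minutes = total_seconds / 60
sampling_rate_hz = 1000 / LINE_INTERVAL_MS

print(total_minutes, round(sampling_rate_hz, 1))
```

This agrees with the five minute scan stated above and corresponds to roughly 13 lines per second.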
Confocal line scan images were analysed in Fiji software, a distribution of ImageJ (National Institutes of Health). The average fluorescence intensity of each line was calculated against time to give a confocal line-scan trace. Using the 'multi kymograph' function, a corresponding kymograph image was produced. In order to calculate beat rate and arrhythmic events, data were fed into pClamp software (Molecular Devices). Baselines were adjusted to account for photobleaching and Ca 2+ transients were counted and analysed using the 'event detection' function. Using the event viewer, any Ca 2+ transients that did not return to baseline and gave a 'double peak', or did not return to at least 75% of the previous Ca 2+ transient amplitude, were considered 'abnormal', as described previously 21 .
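The classification rule above can be sketched in code. The following Python example is a hypothetical illustration only: it is not the pClamp workflow used in this study, and it assumes one reading of the criteria, namely that a transient is abnormal if the trace did not return to baseline before the next peak, or if its amplitude is below 75% of the preceding transient's amplitude. The `Transient` class, its fields and the example values are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Transient:
    amplitude: float            # peak height above baseline (arbitrary units)
    returned_to_baseline: bool  # False if a second peak began before baseline


def is_abnormal(current: Transient, previous: Optional[Transient],
                min_fraction: float = 0.75) -> bool:
    """Apply the assumed reading of the two criteria to one transient."""
    if not current.returned_to_baseline:
        return True  # 'double peak' type event
    if previous is not None and current.amplitude < min_fraction * previous.amplitude:
        return True  # amplitude below 75% of the preceding transient
    return False


def percent_abnormal(transients: List[Transient]) -> float:
    """Percentage of transients in a trace flagged as abnormal."""
    if not transients:
        return 0.0
    abnormal = 0
    previous = None
    for t in transients:
        if is_abnormal(t, previous):
            abnormal += 1
        previous = t
    return 100.0 * abnormal / len(transients)


# Hypothetical trace: two normal events, one low-amplitude, one double peak
trace = [Transient(1.0, True), Transient(0.95, True),
         Transient(0.5, True), Transient(0.9, False)]
print(percent_abnormal(trace))
```

Expressing the result as a percentage of all detected transients mirrors how aberrant events are reported per line in the Results.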

Statistical analysis
All data are presented as mean with standard deviation unless otherwise stated. Statistical analysis of multiple data sets was performed using GraphPad software (version 7.04).
For multiple comparisons between data sets a one-way ANOVA with Tukey's multiple comparison test was chosen. For comparing multiple data sets to a single control column, one-way ANOVA with Dunnett's multiple comparison test was chosen. Significance tests were based on p-values as follows: * p < 0.05; ** p < 0.01; *** p < 0.001; **** p < 0.0001.
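The significance thresholds listed above map directly onto a small lookup. The Python sketch below is illustrative; the function name is hypothetical and the "ns" label for non-significant results is an assumption, as the text does not define one.

```python
def significance_stars(p_value: float) -> str:
    """Return the asterisk notation for a p-value using the stated thresholds."""
    if p_value < 0.0001:
        return "****"
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "ns"  # assumed label; not defined in the text


print(significance_stars(0.0118))
```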

Results
Knock-in of transgenes into the AAVS1 locus
Our overarching goal was to create two isogenic sets of hiPSC lines in order to study Ca 2+ handling in the context of in vitro models of the disease HCM. One isogenic trio comprised lines that were originally wild-type (MYH7 WT/WT ), and then CRISPR Cas9 edited to generate heterozygous (MYH7 WT/MUT ) and homozygous (MYH7 MUT/MUT ) mutants for the c.MYH7 C9123T mutation 20 . The other comprised a pair: a heterozygous, originally patient-derived line (ACTC1 WT/MUT ) carrying the c.ACTC1 G301A mutation, and its CRISPR Cas9 corrected counterpart (ACTC1 WT/WT ) 21,28 .
The AAVS1 locus, located within the first intron of PPP1R12C on chromosome 19 (Figure 1A), is a well characterised safe harbour locus 2 . Using the five lines above, we targeted a cassette containing R-GECO1.0 reporter and a puromycin resistance cassette, driven by the CAG promoter, into the AAVS1 locus (Figure 1A) 24 . This was achieved by using a CRISPR-Cas9 nickase approach based on two sgRNAs. Nucleofection of the HCM-associated hiPSCs with the R-GECO1.0 construct, bidirectional sgRNAs and Cas9 D10A nickase plasmids produced puromycin-resistant clones. PCR-based screening and sequencing were used to examine the regions upstream (Figure 1B) and downstream (Figure 1C) of the insertion site, and hence identify clones that were successfully targeted in one or both alleles (Figure 2B, C).
In addition, the MYH7 WT/WT hiPSC line was used to introduce a HOXA9-T2A-mScarlet cassette driven by a doxycycline-inducible pTRE promoter into the AAVS1 locus using the same CRISPR-Cas9 nickase approach (Figure 1A). Both 5' (Figure 1D) and 3' integration (Figure 1E) to AAVS1 were assessed using the same PCR genotyping approach on genomic DNA to identify successfully targeted clones.

Variability in transgene expression at the AAVS1 locus
Figure 1. (A) Schematic illustrating the chromosomal location of the AAVS1 safe harbour locus. This site was targeted using two sgRNAs in a CRISPR Cas9 nickase strategy. PAM site #1 was silently mutated (G→C) in the targeting construct to prevent it being cut by Cas9 nuclease during targeting. The inserted cassette consists of R-GECO1.0 IRES-Puromycin driven by the CAG promoter. This is flanked on each side by 1 kb of homology to the AAVS1 locus. (B, C) Confirmatory 5' and 3' targeting PCR screens using genomic DNA isolated from the MYH7 C9123T RGECO1.0 isogenic trio (left) and the ACTC1 G301A RGECO duo (right) hiPSCs. Correct 5' targeting is indicated with a 1221 bp product, with sequencing confirming the fidelity of the junction between the AAVS1 left arm homology and the start of the CAG promoter. Correct 3' targeting is indicated with an 1186 bp product, with sequencing confirming the fidelity of the junction between the puromycin-SV40 pA sequence and the AAVS1 right arm homology. (D, E) Confirmatory PCR and sequencing of hiPSC clones to check 5' and 3' targeting of the AAVS1 locus with the HOXA9-T2A-mScarlet cassette.

In order to quantify R-GECO1.0 expression across the targeted clones, high content image analysis was used on hiPSCs that were dual-stained with anti-OCT4 for pluripotency and anti-RFP antibody, which identifies R-GECO1.0 (Figure 2A) 24 . Transgene expression varied widely both between, and within, cell lines. The percentage of cells expressing R-GECO1.0 in AAVS1-targeted ACTC1 WT/WT hiPSC clones ranged from a maximum of 96.4% to a minimum of 32.6% (Figure 2A), and in isogenic mutant ACTC1 WT/MUT hiPSC clones from 49.4% to 0%. Selected clones for the isogenic trio of MYH7 WT/WT , MYH7 WT/MUT and MYH7 MUT/MUT hiPSCs showed comparatively high R-GECO1.0 expression exceeding 73.09% in all cases, an important requirement for a more faithful comparison between lines (Figure 2A).
This variability could not be explained by the incidence of biallelic targeting, as determined by PCR screening. Both alleles were targeted in ACTC1 WT/WT clones three and six, which exhibited high R-GECO1.0 expression as hiPSCs of 91.1% and 85.4%, respectively. This was comparable with the 95.6% and 96.4% expression observed in the monoallelically targeted clones 11 and 12, respectively (Figure 2A and 2B). Variability between clones was also seen upon differentiation, with a MYH7 MUT/MUT Hom 2 clone showing 93.1% R-GECO1.0 expression as hiPSC-CMs, significantly greater than the 2.1% expression observed in the MYH7 MUT/MUT Hom 3 clone (**** p < 0.0001) (Figure 2D).
Taken together, these results highlight that transgene expression levels in hiPSCs can vary significantly between AAVS1-targeted clones, and that this variability persists upon differentiation. Importantly, the level of variability could not be predicted and needed to be tested empirically.
Next, we sought to investigate the time at which silencing occurs during differentiation. To do this, we used the AAVS1-targeted doxycycline-inducible HOXA9-T2A-mScarlet line and differentiated the hiPSCs towards either cardiomyocyte or haematopoietic fate. As expected, the addition of 1 μg/ml doxycycline every 48 hours induced expression of HOXA9 and mScarlet during directed cardiac and haematopoietic differentiation. qRT-PCR analysis showed HOXA9 expression that was 22738-fold higher than in the untargeted hiPSC control on day 0 of cardiomyocyte differentiation (Figure 4A) 24 . However, HOXA9 expression decreased thereafter so that by day 10, expression levels were only 175-fold greater than in the untargeted hiPSC control. Similarly, qRT-PCR analysis during haematopoietic differentiation revealed peak expression of HOXA9 occurring on day two, with 45666-fold greater expression than in the untargeted hiPSC control, decreasing thereafter to 64-fold on day 12 (Figure 4D). These results were mirrored at the protein level, where early peak expression of mScarlet fluorescence occurred on day 0 of cardiomyocyte differentiation (Figure 4B and 4C) and on day two of haematopoietic differentiation (Figure 4E and 4F), decreasing at later timepoints of differentiation.
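As a worked illustration, the decline between the peak and the late timepoint can be expressed as a ratio of the fold-change values quoted in the paragraph above. The Python sketch below uses only those four qRT-PCR values; the variable names are illustrative and no additional data are implied.

```python
# (peak fold change, late fold change) relative to untargeted hiPSC control,
# taken from the qRT-PCR values quoted in the text
hoxa9_fold_change = {
    "cardiomyocyte": (22738, 175),   # day 0 vs day 10
    "haematopoietic": (45666, 64),   # day 2 vs day 12
}

# Ratio of peak to late expression for each differentiation
fold_drop = {
    lineage: peak / late
    for lineage, (peak, late) in hoxa9_fold_change.items()
}

for lineage, drop in fold_drop.items():
    print(f"{lineage}: ~{drop:.0f}-fold decrease from peak")
```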
For the haematopoietic differentiation, expression of the key mesoderm markers MIXL1 and Brachyury peaked on day two (Figure 4G and 4H). This suggests that silencing of the AAVS1 locus can occur immediately after mesoderm patterning. As a whole, these results show a progressive silencing of transgene expression as mesoderm differentiation progresses.
AAVS1-targeted R-GECO1.0 expressing clones as a tool for in vitro disease modelling and drug screening
Despite some obstacles due to unanticipated AAVS1 silencing, isogenic R-GECO1.0 expressing clones in genetic backgrounds associated with HCM were successfully generated and used to image Ca 2+ transients using confocal laser line scan microscopy.
As this technique involves assaying single cells, even poorly expressing clones were useful. By monitoring the fluctuation of R-GECO1.0 fluorescence over time, Ca 2+ transient traces could be generated for each line (Figure 5A-C and 5E-F) 24 . Despite some expected variability in spontaneous beat rate between wild-type hiPSC-CMs (Figure 5A and 5E) 29 , for both the c.MYH7 C9123T and c.ACTC1 G301A mutations, increasing mutation load resulted in an increased incidence of abnormal Ca 2+ transient events. MYH7 WT/WT hiPSC-CMs only presented 1.1% aberrant Ca 2+ transient events, increasing to 4.33% in MYH7 WT/MUT hiPSC-CMs, and further increasing to 11.2% in MYH7 MUT/MUT hiPSC-CMs (p < 0.0001) (Figure 5D). This represented a ten-fold increase in the occurrence of aberrant Ca 2+ transients in the homozygous mutant compared to isogenic wild-type control. Likewise, 15.65% of Ca 2+ transients in ACTC1 WT/MUT hiPSC-CMs were calculated as being aberrant, compared to 6.7% (±0.6%) in ACTC1 WT/WT isogenic control hiPSC-CMs (p = 0.0118) (Figure 5G). This demonstrated the utility of the AAVS1-targeted R-GECO1.0 cell lines for in vitro disease modelling and phenotyping as a credible alternative to the use of animal models.
We then attempted to rescue the aberrant Ca 2+ transient phenotype in our HCM models with the use of a combination treatment of ranolazine, a late sodium channel blocker, and dantrolene, a ryanodine receptor antagonist. In combination at 10 μM with 24 hours incubation, these two drugs significantly   These results show that abnormalities in Ca 2+ transients caused by the sarcomeric mutations c.MYH7 C9123T or c.ACTC1 G301A can be identified with AAVS1-targeted R-GECO1.0 expression, and this phenotype can be subsequently rescued with targeted pharmacological intervention aimed at reducing intracellular Na + and Ca 2+ .

Discussion
Precise integration of exogenous DNA into the genome is often performed by targeting a genomic 'safe harbour' locus that can tolerate gene insertion with few deleterious effects and limited transgene silencing. The AAVS1 locus is a popular choice for targeted knock in of exogenous DNA 16,17 . This region of the genome is claimed to facilitate robust and persistent transgene expression 11 , aided by flanking insulator regions 10 . Here, we show variable success targeting the AAVS1 locus with the genetically encoded calcium indicator R-GECO1.0 or a doxycycline-inducible HOXA9-T2A-mScarlet cassette using a CRISPR Cas9 nickase approach.
Variability in R-GECO1.0 expression between AAVS1-targeted clones as hiPSCs was observed, highlighting the importance of thorough screening of clones. Unsurprisingly, clones that had undergone biallelic targeting retained high R-GECO1.0 expression as hiPSCs, yet monoallelically targeted clones ranged from high expression to significantly reduced R-GECO1.0 expression. These incidences of low expression as hiPSCs may be due to clone-specific silencing, or some clones favouring expression from the untargeted allele. It has been shown that some genes within cells favour monoallelic expression 30 . Indeed, our own studies using an antibody for the c.ACTC1 G301A mutation have shown that cells heterozygous for the mutation only express mutant protein in ~50% of the population 21 . These results highlight the heterogeneity that can exist between clones once they have been generated.
Even with the identification of AAVS1-targeted clones that exhibited robust R-GECO1.0 expression, there were instances of silencing upon differentiation to hiPSC-CMs. Transgene silencing at the AAVS1 locus has previously been shown upon differentiation towards hepatocyte-like cells, with de novo methylation of the locus found to be responsible 12 . However, the aforementioned report claims that this silencing effect is restricted to endoderm differentiations. With the use of the AAVS1-targeted HOXA9-T2A-mScarlet cell line, we show that in two mesoderm-specific differentiations towards cardiomyocytes and haematopoietic cells, silencing of expression occurs immediately after mesoderm specification and peak expression of MIXL1 and Brachyury. This is in agreement with a recent report which demonstrated differential transgene methylation in AAVS1-targeted iPSC-derived myeloid cells 13 . We therefore postulate that AAVS1-mediated silencing, likely as a result of de novo methylation, can occur upon differentiation to various germ layers, and therefore AAVS1 cannot be considered a true safe harbour locus.
The choice of promoter may also play a role in AAVS1-mediated silencing, as a previous report has shown AAVS1 silencing of eGFP expression when using the EF1a promoter that was overcome with the use of the stronger CAG promoter 31 . This influenced our decision to opt for the CAG promoter over the EF1a promoter in our constructs. Indeed, the CAG promoter appears to exhibit some insulation from methylation of transgenes at the AAVS1 locus 12 . In addition, a recent report elegantly demonstrated that contextual silencing at the AAVS1 locus of iPSC-derived myeloid precursors can occur with varying efficiency depending on the chosen promoter 13 .
When attempting to express two separate transgenes from the same cassette at the AAVS1 locus, the choice of peptide cleavage sequence is also important. We observed inefficient peptide cleavage when using the P2A sequence, as determined by Western Blot (see Extended data) 24 . This informed our choice of using the T2A or IRES sequences for translation of multicistronic cassettes in subsequent targeting constructs. It has been claimed that the P2A cleavage sequence is the most efficient self-cleaving peptide, followed by T2A and E2A, when used to cleave a bicistronic vector in three different human cell lines, including HeLa 32 . We cannot reconcile this with our data, as co-expression of multicistronic cassette elements from the AAVS1 locus in hiPSCs has been achieved using the IRES and T2A sequence, but not the P2A sequence.
Depending on the application of the targeted cell lines, some AAVS1-mediated silencing can be tolerated. For the R-GECO1.0 expressing lines, maintaining puromycin selection and using a single cell confocal laser line scan assay to study Ca 2+ transients meant that abnormalities in Ca 2+ handling could be identified in cell lines harbouring the c.MYH7 C9123T or c.ACTC1 G301A mutations. This represents an in vitro model of HCM that can offer an alternative to the use of animal models. Furthermore, this phenotype could be rescued with combination treatment with dantrolene and ranolazine 20,21 . However, the aim with the doxycycline-inducible HOXA9-T2A-mScarlet cell line was to modulate HOXA9 expression at different timepoints throughout haematopoietic differentiation. Clearly, the clone described herein is not suitable for this task.
In conclusion, one must carefully select AAVS1-targeted clones depending on their application due to the risk of transgene silencing upon differentiation. Other potential safe harbour loci, such as CLYBL, have been claimed to deliver five- to ten-fold higher fluorescent transgene expression than AAVS1 6 . However, as silencing cannot be predicted, multiple clones must be thoroughly checked for expression level and chosen according to their application. Our results dispute the claims of robust and persistent transgene expression from AAVS1 11 , and complement reports that show silencing at AAVS1 upon differentiation to endoderm lineage 12 , by showing similar silencing upon mesoderm differentiation.

This project contains the following underlying data:
- Figure 1 -data (raw PCR integration gel images in TIF format; and sequencing chromatograms underlying Figure 1 as AB1

Sara Howden
Murdoch Children's Research Institute (MCRI), Melbourne, Vic, Australia
In this study, the authors clearly show that targeting transgenes to the "safe harbour" AAVS1 results in highly variable transgene expression in iPSCs, which is exacerbated even further following differentiation. This has important implications for researchers contemplating this locus for generating iPSC lines to stably express a transgene of interest (e.g. fluorescent reporter, Cas9, transcription factor etc.). The study appears well executed and their conclusions are well supported by their findings. One minor comment, but certainly not a deal breaker: it would have been nice to see clones analysed by flow cytometry. This would not only give a very accurate picture of the number of cells expressing the given reporter but also a more accurate picture of the level of transgene expression (i.e. median peak fluorescence) not only between clones but within clones. Also, it might be helpful if the authors indicate how they separate endogenous HOXA9 expression from exogenous (transgene) expression. All in all, a great little study with very useful data/implications for those in the field!

Are sufficient details of methods and analysis provided to allow replication by others? Partly
If applicable, is the statistical analysis and its interpretation appropriate? I cannot comment. A qualified statistician is required.

Are the conclusions drawn adequately supported by the results? Yes

No competing interests were disclosed.
In this paper, the authors aim to investigate the effect of hypertrophic cardiomyopathy associated mutations in MYH7 and ACTC1 on Ca 2+ transients using hiPSC-derived cardiomyocytes. To this end, the authors established hiPSC reporters by introducing a CAG promoter-controlled calcium indicator (R-GECO1.0) into the AAVS1 locus through CRISPR/Cas9 nickase-mediated genome editing. Among 24 clones, 20 were found to successfully express the transgene with variation from about 0 to 99.5%. Upon differentiation to cardiomyocytes, even clones with high expression levels demonstrated significant silencing of the transgene to 13.03% or 1.33%. By also creating an AAVS1-targeted doxycycline-inducible HOXA9-T2A-mScarlet iPSC reporter, the authors investigated the silencing over the time course of mesoderm lineage commitment. It was shown that both mRNA and protein levels of the transgenes were relatively high on day 2 and abruptly decreased from day 4 during both cardiomyocyte and hematopoietic differentiation. Despite the silencing of R-GECO1.0 in the AAVS1 locus, Ca 2+ live imaging by confocal laser line scan microscopy at the single cell level detected abnormal Ca 2+ transients in cardiomyocytes derived from hiPSC reporters harboring the MYH7 or ACTC1 mutation compared to wildtype or isogenic control. In addition, this abnormality could be rescued by pharmacologically inhibiting the intracellular level of Na+ and Ca 2+ . This study shows that iPSC reporter lines are of importance for disease modelling. This study presented an important issue in AAVS1-targeted transgene expression in iPSCs and mesoderm lineage although the mechanisms for the silencing are not clear. The variable expression and silencing in mesoderm differentiation shown here are in line with previous reports that the AAVS1 is not a true safe harbor for cells differentiated to hematopoietic cells (e.g. PMID: 31773990) and endoderm (e.g. PMID: 26455413); and, hence, the findings are not fully novel.

Specific comments:
The authors did not mention why, for MYH7, they compared wild-type (MYH7 WT/WT ), heterozygous (MYH7 WT/MUT ) and homozygous (MYH7 MUT/MUT ) lines for the c.MYH7 mutation, but no homozygous mutant line for ACTC1.
In Fig 2A, the expression of R-GECO in ACTC1 WT/MUT clones (1/5 with about 50% cells) was lower than in ACTC1 WT/WT clones. Is this chance or due to the ACTC1 mutation?

In Fig 2A, it would be better to show R-GECO in red and OCT4 in green to retain consistency with other images.
In Fig 3C, immunostaining in the upper images showed R-GECO in red while in the lower ones R-GECO is stained in green, which is difficult to follow.
It is interesting to see that puromycin enrichment of the iPSCs over 3 passages increased R-GECO1.0 expression in MYH7 clone 15; did the authors also try puromycin enrichment in the other MYH7 line?
In the introduction, the authors mention that a doxycycline-inducible HOXA9-T2A-mScarlet cassette targeted in the AAVS1 locus of hiPSCs is used for modulating HOXA9 during hematopoietic differentiation. However, this iPSC reporter is also used for cardiomyocyte differentiation (Fig 4A). It is unclear whether there is a specific reason why this reporter line is used instead of AAVS1-CAG R-GECO iPSCs.
It would be better to present the RT-qPCR results for Fig. 4 using a ∆Ct method, as fold changes are confusing, especially in the context of transgene expression (the biological relevance of a fold change is unclear unless one knows the baseline transcript levels before gene activation).
In Fig 6 (Western blot, Extended data), to test the peptide cleavage of P2A and T2A, the authors state that P2A is less efficient than T2A, as NpHR expression (following a P2A) was barely visible. The authors should provide a positive control to exclude the possibility that the antibody for NpHR did not work.
Southern blots should be performed to make sure that the clones tested in this study were targeted in the correct locus, and that silencing was not due to random integration.
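
The point about ∆Ct versus fold change for Fig. 4 can be illustrated with a minimal Python sketch. It assumes a single reference (housekeeping) gene and 100% amplification efficiency, and every Ct value in it is hypothetical and not taken from the manuscript. The function names are illustrative only.

```python
def delta_ct(ct_target: float, ct_reference: float) -> float:
    """dCt: target Ct normalised to a reference (housekeeping) gene."""
    return ct_target - ct_reference


def relative_expression(ct_target: float, ct_reference: float) -> float:
    """2^-dCt: transcript abundance relative to the reference gene."""
    return 2.0 ** -delta_ct(ct_target, ct_reference)


def fold_change(sample: tuple[float, float], baseline: tuple[float, float]) -> float:
    """2^-ddCt: expression in `sample` relative to `baseline`."""
    ddct = delta_ct(*sample) - delta_ct(*baseline)
    return 2.0 ** -ddct


# Hypothetical (target Ct, reference Ct) pairs; NOT data from the manuscript.
SAMPLES = {
    "day 0, uninduced": (35.0, 18.0),
    "day 2, +dox": (22.0, 18.0),
    "day 10, +dox": (33.0, 18.0),
}

if __name__ == "__main__":
    baseline = SAMPLES["day 0, uninduced"]
    for label, cts in SAMPLES.items():
        print(
            f"{label}: dCt = {delta_ct(*cts):.1f}, "
            f"2^-dCt = {relative_expression(*cts):.2e}, "
            f"fold vs day 0 = {fold_change(cts, baseline):.0f}"
        )
```

With these invented numbers, day 2 shows a several-thousand-fold increase over the uninduced baseline, yet its ∆Ct of 4 indicates that the transgene transcript is still 16-fold less abundant than the reference gene. Reporting ∆Ct (or 2^-∆Ct) alongside fold change makes that baseline visible.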

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Partly
No competing interests were disclosed.

We confirm that we have read this submission and believe that we have an appropriate level of expertise to state that we do not consider it to be of an acceptable scientific standard, for reasons outlined above.
Author Response 21 May 2020
Jamie Bhagwan, University of Nottingham, Nottingham, UK

Dear Reviewer,

Thank you for your comments on our manuscript entitled Variable expression and silencing of CRISPR-Cas9 targeted transgenes identifies the AAVS1 locus as not an entirely safe harbour.
We have endeavoured to respond to your comments and suggestions as outlined below in bold font.
In this paper, the authors aim to investigate the effect of hypertrophic cardiomyopathy associated MYH7 and ACTC1 mutations on Ca2+ transients using hiPSC-derived cardiomyocytes. To this end, the authors established hiPSC reporters by introducing a CAG promoter-controlled calcium indicator (R-GECO1.0) into the AAVS1 locus through CRISPR/Cas9 nickase-mediated genome editing. Among 24 clones, 20 were found to successfully express the transgene, with variation from about 0 to 99.5%. Upon differentiation to cardiomyocytes, even clones with high expression levels demonstrated significant silencing of the transgene to 13.03% or 1.33%. By also creating an AAVS1-targeted doxycycline-inducible HOXA9-T2A-mScarlet iPSC reporter, the authors investigated the silencing over the time course of mesoderm lineage commitment. It was shown that both mRNA and protein levels of the transgenes were relatively high on day 2 and abruptly decreased from day 4 during both cardiomyocyte and hematopoietic differentiation. Despite the silencing of R-GECO1.0 in the AAVS1 locus, live Ca2+ imaging by confocal laser line scan microscopy at the single cell level detected abnormal Ca2+ transients in cardiomyocytes derived from hiPSC reporters harboring MYH7 or ACTC1 mutations compared to wildtype or isogenic controls. In addition, this abnormality could be rescued by pharmacologically inhibiting the intracellular levels of Na+ and Ca2+. This study shows that iPSC reporter lines are of importance for disease modelling. This study presents an important issue in AAVS1-targeted transgene expression in iPSCs and the mesoderm lineage, although the mechanisms for the silencing are not clear. The variable expression and silencing in mesoderm differentiation shown here are in line with previous reports that AAVS1 is not a true safe harbor for cells differentiated to hematopoietic cells (e.g. PMID: 31773990) and endoderm (e.g. PMID: 26455413); and, hence, the findings are not fully novel.
site when using the EF1α promoter that was overcome by replacing it with the CAG promoter. Indeed, this paper informed our choice of promoter for the RGECO targeting construct.
The penultimate sentence of the third paragraph in the Introduction now reads: "However, some reports of DNA methylation dampening transgene expression in both hPSC-derived hepatocytes (Ordovás et al., 2015) and iPSC-derived myeloid progenitors (Klatt et al., 2020) raise questions on whether a 'perfect' safe harbour locus exists"

The Discussion now reads: "Transgene silencing at the AAVS1 locus has previously been shown upon differentiation towards hepatocyte-like cells, with de novo methylation of the locus found to be responsible (Ordovás et al., 2015). However, the aforementioned report claims that this silencing effect is restricted to endoderm differentiations. With the use of the AAVS1-targeted HOXA9-T2A-mScarlet cell line, we show that in two mesoderm-specific differentiations towards cardiomyocytes and haematopoietic cells, silencing of expression occurs immediately after mesoderm specification and peak expression of MIXL1 and Brachyury. This is in agreement with a recent report which demonstrated differential transgene methylation in AAVS1-targeted iPSC-derived myeloid cells (Klatt et al., 2020)."

And "Indeed, the CAG promoter appears to exhibit some insulation from methylation of transgenes at the AAVS1 locus (Ordovás et al., 2015). In addition, a recent report elegantly demonstrated that contextual silencing at the AAVS1 locus of iPSC-derived myeloid precursors can occur with varying efficiency depending on the chosen promoter (Klatt et al., 2020)."

The authors recognise and acknowledge that this phenomenon has been shown in endoderm and this is referred to in the manuscript.

The last paragraph of the Introduction contains the following sentence:
"This suggests that silencing at the AAVS1 locus is not limited to the endoderm lineage as previously described (Ordovás et al., 2015)."

The Discussion states that:
"Our results dispute the claims of robust and persistent transgene expression from AAVS1, and complements reports that show silencing at AAVS1 upon differentiation to endoderm lineage (Ordovás et al., 2015)"

Specific comments:
The authors did not mention why for MYH7 they compared wild-type (MYH7 WT/WT), heterozygous (MYH7 WT/MUT) and homozygous (MYH7 MUT/MUT) for the c.C9123T mutation, but no homozygous mutant line for ACTC1.

The authors apologise for not making this clearer in the text. The discrepancy between the range of genetically-engineered lines exhibiting the MYH7 or ACTC1 mutations is due to the source of the cells and their subsequent CRISPR/Cas9 targeting strategy. The MYH7-mutant hiPSC lines were generated by targeting the WT allele of unrelated healthy cell lines of origin (Mosqueira et al., 2018), enabling the generation of heterozygous and homozygous clones. In contrast, the isogenic set of ACTC1 lines was generated by genomic correction of the mutant allele of the starting hiPSC line derived from a heterozygous patient (using a donor vector containing the WT allele only) (Smith et al., 2018; Kondrashov et al., 2018). As such, homozygous mutant clones could not be generated by employing this strategy. Notwithstanding, the vast majority of HCM-causing mutations in the sarcomeric genes are heterozygous (Lopes et al., 2013), as homozygous mutations tend to be lethal. In particular, the mutations under study (p.R453C-βMHC and p.E99K-ACTC1) have never been reported to be homozygous in patients, to the best of our knowledge. Therefore, the cellular models generated already encompass the patient-relevant genotypic status and as such provide an accurate characterization of HCM in vitro. Homozygous mutant lines were included to provide extra readout sensitivity for the phenotypic assays developed (as seen by more severe phenotypes in Figure 5).

The first paragraph of the Results section now reads: "Our overarching goal was to create two isogenic sets of hiPSC lines in order to study Ca2+ handling in the context of in vitro models of the disease HCM. One isogenic trio comprised lines that were originally wild-type (MYH7 WT/WT) and then CRISPR Cas9 edited to generate heterozygous (MYH7 WT/MUT) and homozygous (MYH7 MUT/MUT) mutants for the c.C9123T mutation (Mosqueira et al., 2018; Kondrashov et al., 2018).

In Fig 2A, the expression of R-GECO in ACTC1 clones (1/5 with about 50% cells) was lower than in MYH7 clones. Is this chance or due to the ACTC1 mutation?

We believe that this is unlikely to be due to the ACTC1 mutation but simply highlights the variability between different cell lines. Each cell line, and subsequently each clone, seemingly has varying levels of transgene silencing, and extensive screening is therefore required, as we have performed.

In Fig 2A, it would be better to show R-GECO in red and OCT4 in green to retain consistency with other images.

Agreed. The images in Figure 2A for MYH7 clone 1 and MYH7 clone 4 have now been pseudocoloured to match the other images in the panel.
In Fig 3C, immunostaining in the upper images showed R-GECO in red while in the lower ones R-GECO is stained in green, which is difficult to follow.

Agreed. To retain consistency, the upper images in Figure 3C have been pseudocoloured so that R-GECO is always represented as red. Figure 2D and all accompanying graphs have also been changed to aid consistency.

It is interesting to see that puromycin enrichment of the iPSCs over 3 passages increased R-GECO1.0 expression in MYH7 clone 15; did the authors also try puromycin enrichment in the other MYH7 line?

Our aim was to have a high enough percentage of cardiomyocytes expressing R-GECO so that confocal line scan analysis was technically feasible. The MYH7 clone 15 was originally enriched due to the extremely low level of cardiomyocyte R-GECO expression, which made it difficult to perform single cell analysis. In contrast, the 13.03% R-GECO expression in the

MUT/MUT
There are two arms to this manuscript. The authors aimed to determine the reliability of introducing a transgene containing a calcium indicator (R-GECO1.0) or a fluorescent reporter (HOXA9-T2A-mScarlet) into the AAVS1 locus using CRISPR-Cas9 and then subsequently investigate the effect of mutations in MYH7 or ACTC1 on calcium handling.
Initial experiments showed that although the AAVS1 could be targeted successfully in 20/24 clones, the expression of the transgene was highly variable. Furthermore, on differentiation of the more successfully transfected clones to cardiomyocytes, the transgene was silenced with almost complete removal of expression of the fluorescent reporter. A similar degree of silencing was observed during haematopoietic differentiation. Using the mScarlet reporter the authors investigated the timecourse of downregulation and found that with both cardiomyocyte and haematopoietic differentiation, increased transgene expression was observed at day 2 but then decreased substantially over time, at both the mRNA and protein level. Nevertheless, using puromycin selection, it was possible to isolate single cardiomyocytes expressing the R-GECO reporter and these were used to show that mutations in both MYH7 and ACTC1 resulted in abnormal calcium transient events which could be corrected in part using ranolazine and dantrolene. This is a valuable piece of work which demonstrates shortfalls in what was expected to be a fairly reliable transfection protocol and that the AAVS1 locus is not as foolproof a site for transfection as may have been thought. The addition of the calcium transient measurements is interesting in that it shows that something can be rescued from such a large body of work, but it does come as rather an afterthought in the manuscript.
I have a few minor comments to aid clarity: The introduction needs to be expanded somewhat. The paragraph on modulating HOXA9 is very brief and the rationale needs to be explained. In the results section this appears to be merely a way of monitoring transgene expression, but the short paragraph in the introduction implies some sort of mechanistic approach.
It is not entirely clear to me what the difference is between the data shown in Fig 2D and 3B apart from different clones.
In the section on HOXA9-T2A targeting it says in results that doxycycline was administered every 48 hours. This is not mentioned in the methods or the extended information. The authors state that transgene induction decreased after day 2. Was this reduction despite further addition of doxycycline?
In figure 5, the timing on the x axis is given in parts H and I but not in other plots. In some others there is a bar but this is not explained. Is the scale the same throughout? If so, some comment should be made as to why the wild type MYH7 cells have a slower beat rate than the ACTC1 wild type cells. The same formatting should be used on all the graphs.

Is the study design appropriate and is the work technically sound? Yes
Are sufficient details of methods and analysis provided to allow replication by others?

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes
Are the conclusions drawn adequately supported by the results? Yes

Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Cardiac physiology and cardiac stem cells for regeneration, disease phenotyping and drug testing.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
Author Response 21 May 2020
Jamie Bhagwan, University of Nottingham, Nottingham, UK

Dear Reviewer,

Thank you for your comments on our manuscript entitled Variable expression and silencing of CRISPR-Cas9 targeted transgenes identifies the AAVS1 locus as not an entirely safe harbour.
We have endeavoured to respond to your comments and suggestions as outlined below in bold font.
There are two arms to this manuscript. The authors aimed to determine the reliability of introducing a transgene containing a calcium indicator (R-GECO1.0) or a fluorescent reporter (HOXA9-T2A-mScarlet) into the AAVS1 locus using CRISPR-Cas9 and then subsequently investigate the effect of mutations in MYH7 or ACTC1 on calcium handling.
Initial experiments showed that although the AAVS1 could be targeted successfully in 20/24 clones, the expression of the transgene was highly variable. Furthermore, on differentiation of the more successfully transfected clones to cardiomyocytes, the transgene was silenced with almost complete removal of expression of the fluorescent reporter. A similar degree of silencing was observed during haematopoietic differentiation. Using the mScarlet reporter the authors investigated the timecourse of downregulation and found that with both cardiomyocyte and haematopoietic differentiation, increased transgene expression was observed at day 2 but then decreased substantially over time, at both the mRNA and protein level. Nevertheless, using puromycin selection, it was possible to isolate single cardiomyocytes expressing the R-GECO reporter and these were used to show that mutations in both MYH7 and ACTC1 resulted in abnormal calcium transient events which could be corrected in part using ranolazine and dantrolene. This is a valuable piece of work which demonstrates shortfalls in what was expected to be a fairly reliable transfection protocol and that the AAVS1 locus is not as foolproof a site for transfection as may have been thought. The addition of the calcium transient measurements is interesting in that it shows that something can be rescued from such a large body of work, but it does come as rather an afterthought in the manuscript.
I have a few minor comments to aid clarity: The introduction needs to be expanded somewhat. The paragraph on modulating HOXA9 is very brief and the rationale needs to be explained. In the results section this appears to be merely a way of monitoring transgene expression, but the short paragraph in the introduction implies some sort of mechanistic approach.

For the purposes of this paper the HOXA9 model is simply used as a tool to monitor transgene expression. However, its overall purpose in the context of haematopoiesis is now described in the introduction for completeness. The penultimate paragraph of the Introduction now reads: "In addition, CRISPR Cas9 targeting of the AAVS1 locus was used to target a doxycycline-inducible HOXA9-T2A-mScarlet cassette into hiPSCs. HOXA9 is a transcription factor regulated spatio-temporally during haematopoietic or cardiac development (Behrens et al., 2013) and the aim was to examine if controlled supplemental expression of HOXA9 resulted in more efficient production of mature cells."

It is not entirely clear to me what the difference is between the data shown in Fig 2D and 3B apart from different clones.

Yes, we concede that these experiments are similar. However, Fig 2D is

In the section on HOXA9-T2A targeting it says in results that doxycycline was administered every 48 hours. This is not mentioned in the methods or the extended information. The authors state that transgene induction decreased after day 2. Was this reduction despite further addition of doxycycline?

Yes, this reduction did occur despite further addition of doxycycline every 2 days. In addition to text in the results and the figure legend for Figure 3, the methods have now been clarified with a sentence in the 'Live imaging of mScarlet' section which now reads: "HOXA9 and mScarlet expression was induced with the addition of 1 µg/ml doxycycline every 48 hours."

In figure 5, the timing on the x axis is given in parts H and I but not in other plots. In some others there is a bar but this is not explained. Is the scale the same throughout? If so, some comment should be made as to why the wild type MYH7 cells have a slower beat rate than the ACTC1 wild type cells. The same formatting should be used on all the graphs.

The authors apologise for the error. The x-axis scale bar relates to time (5 seconds) and the y-axis scale bar relates to the length of the laser line drawn across the cell to perform the confocal line scan. All panels now contain both scale bars and the legend for Figure 5 has been corrected to include this information. The methods have also been clarified to state that these line scans are performed on spontaneously beating cardiomyocytes, hence their slightly varied beat rate. The end of the first paragraph of the 'confocal analysis' methods section now reads: "Line-scan images were taken every 75 milliseconds, with a pixel dwell time of 4.12 µsec, for a total of 4000 cycles resulting in a five minute scan. CMs were kept at 37°C and 5% CO2 and allowed to spontaneously beat throughout data acquisition." Differences in beat rate and action potential duration between healthy hiPSC-CMs are common, as previously reviewed (Sala et al., 2017). This further advocates the need for isogenic lines in order to ensure that the impact of the mutation studied is accurately investigated. The fourth sentence of the section entitled "AAVS1-targeted R-GECO1.0 expressing clones as a tool for in vitro disease modelling and drug screening" now reads: "Despite some expected variability in spontaneous beat rate between wild-type hiPSC-CMs (Figure 5A and 5E

Competing Interests: No competing interests were disclosed.
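
As a quick consistency check on the acquisition settings quoted in the response above (one line-scan every 75 milliseconds, 4000 cycles, a 5 second x-axis scale bar), the arithmetic can be sketched in Python. The function names are illustrative assumptions and are not taken from the authors' analysis pipeline; only the three constants come from the quoted text.

```python
LINE_INTERVAL_MS = 75   # one line-scan every 75 ms (quoted methods text)
N_CYCLES = 4000         # line-scans per recording (quoted methods text)
SCALE_BAR_S = 5         # x-axis scale bar in Figure 5 (author response)


def scan_duration_s(interval_ms: float, n_cycles: int) -> float:
    """Total recording time in seconds for a given line interval and cycle count."""
    return interval_ms * n_cycles / 1000.0


def line_to_time_s(line_index: int, interval_ms: float = LINE_INTERVAL_MS) -> float:
    """Convert a line index along the scan axis to elapsed time in seconds."""
    return line_index * interval_ms / 1000.0


def lines_per_scale_bar(scale_bar_s: float = SCALE_BAR_S,
                        interval_ms: float = LINE_INTERVAL_MS) -> float:
    """Number of scan lines spanned by the time scale bar."""
    return scale_bar_s * 1000.0 / interval_ms


if __name__ == "__main__":
    total_s = scan_duration_s(LINE_INTERVAL_MS, N_CYCLES)
    print(f"scan duration: {total_s:.0f} s ({total_s / 60:.0f} min)")
    print(f"lines per {SCALE_BAR_S} s scale bar: {lines_per_scale_bar():.1f}")
```

The stated settings are internally consistent: 4000 cycles at 75 ms per line give 300 seconds, which matches the five minute scan described in the revised methods.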