A novel genotyping method to determine copy number in a mouse line commonly used for inducible transgene expression in brain and spinal cord [version 1; peer review: 3 approved with reservations]

Background: The NEFH-tTA mouse has the human neurofilament heavy polypeptide promoter directing tetracycline-controlled transactivator protein (tTA) expression to the brain and spinal cord, allowing tissue-specific and doxycycline-suppressible expression of a target gene. Current genotyping protocols can only differentiate between wild-type and transgenic animals. Being able to differentiate between hemizygous and homozygous animals would be beneficial in experiment planning and reducing animal numbers. Methods: We identified the insertion site of the NEFH-tTA transgene via targeted locus amplification and next-generation sequencing. The insertion site was then used to design a multiplex PCR assay to distinguish between hemizygous and homozygous mice. Results: The NEFH-tTA transgene is located on chromosome 12. Our genotyping method can identify hemizygous and homozygous mice. Conclusions: The NEFH-tTA transgenic mouse line is a useful tool for studying a wide range of diseases including frontotemporal dementia and motor neuron disease, as well as other neurodevelopmental, neuromuscular or neurodegenerative disorders. We have designed and utilised a novel genotyping assay, based on a simple PCR, to distinguish between hemizygous and homozygous mice. This is easily adaptable to a laboratory-specific protocol or machine, and will allow refinement of breeding strategies and a reduction in the number of animals that cannot be used in experiments.


Introduction
The NEFH-tTA mouse has the human neurofilament heavy polypeptide promoter directing tetracycline-controlled transactivator protein (tTA) expression to large-calibre axons of the brain and spinal cord, allowing tissue-specific and doxycycline-suppressible expression of a target gene [1]. The line was generated via pronuclear injection of a plasmid in which the human NEFH promoter, isolated from a BAC clone (CHORI: RP11-91J21), drives expression of the tetracycline transactivator (tTA) gene; the plasmid integrated randomly into the genome [1]. A genotyping protocol is available to distinguish between wild-type and transgenic animals [1], but it cannot separate hemizygous from homozygous animals. Ideally, a colony of homozygous animals would be maintained for experimental breeding with another transgenic line carrying the gene of interest under the control of a tetracycline response element (TRE). All offspring of these crosses would therefore be NEFH-tTA +/- (which has been shown to be a sufficient driver of expression [1]); this would allow use of littermate controls and reduce required breeding numbers, as encouraged by the ARRIVE guidelines [2]. To address this, we have identified the insertion site of the NEFH-tTA transgene, designed a novel PCR genotyping strategy, and demonstrated its reliability in a colony of NEFH-tTA mice.

Ethical statement
All animal work was approved following local ethical review by the University of Manchester Animal Welfare and Ethical Review Board and performed under Home Office project licence 70/8903 and in accordance with the Animals (Scientific Procedures) Act 1986. All efforts were made to ameliorate harm to animals; mice were housed and maintained according to standard practices in the University of Manchester Biological Services Facility (detailed below) and no licensed procedures were performed.

Maintenance of transgenic mice
B6;C3-Tg(NEFH-tTA)8Vle/J mice (JAX stock #025397) were obtained from Jackson Laboratories (Bar Harbor, ME, USA) and bred with C57BL/6J mice to produce a colony of hemizygous NEFH-tTA animals (13 transgenic mice were crossed with 12 wild-type mice in total, as this number was sufficient to create a new colony of backcrossed mice). All animal work was carried out in the University of Manchester Biological Services Facility. Mice were housed under light-, humidity- and temperature-controlled conditions: ambient temperature of 21°C (±2°C), humidity of 40-50%, 12 h light cycle, and ad libitum access to water and standard rodent chow. Animals were housed up to five per cage with environmental enrichment. For breeding, mice were housed either as breeding pairs or as trios of two females and one male, and pups were housed with their parents until weaning (approximately four weeks after birth).
At weaning, animals were restrained by scruffing, and ear biopsies were taken using a standard ear puncher and genotyped as described [1]. Briefly, tissue was disrupted in proteinase K for one hour at 57°C; reactions containing 3μL of diluted DNA, 10μL PrimeSTAR HS Premix (Takara, Clontech #R040A) and 0.5μM each of forward and reverse transgene primers and internal positive control primers (Primers 1-4, Table 1) in a total volume of 20μL were amplified in a thermal cycler (ThermoFisher SimpliAmp™ A24812; Table 2). PCR products were visualised on a 2% agarose gel with HyperLadder IV (Bioline) to determine product size.

Identification of the transgene insertion point
A combination of targeted locus amplification (TLA) and next-generation sequencing was used to identify the insertion point of the NEFH-tTA transgene. This was performed by Cergentis B.V. (Utrecht, The Netherlands). One spleen was dissected from a six-week-old NEFH-tTA +/- mouse (culled by Schedule 1 CO2 inhalation) and cells were isolated for TLA sample preparation. Spleen tissue was disrupted by gently pushing it through a 40μm mesh and cells were collected in PBS with 10% foetal calf serum (Gibco #10500064). Cells were pelleted by centrifugation at 250 × g for ten minutes at room temperature, resuspended in lysis buffer (0.15M NH4Cl, 0.01M KHCO3, 0.0002M EDTA) and incubated at room temperature for five minutes before a further centrifugation at 250 × g for ten minutes. Cells were resuspended in PBS with 10% foetal calf serum (Gibco #10500064), stored at -80°C and shipped to Cergentis on dry ice. Two primer sets (Table 3) were used in individual TLA amplifications. PCR products were purified, prepared as libraries using the Illumina Nextera XT protocol and sequenced on an Illumina sequencer. Reads were mapped using BWA-SW, a Smith-Waterman alignment tool. BWA-SW allows partial mapping, which is well suited to identifying break-spanning reads. The mouse genome version mm10 (GenBank assembly accession: GCA_000001635.2) was used for mapping.

Genotyping of offspring from hemizygous intercross matings
Hemizygous mice were bred together (total 25 animals; the number of breeding animals was chosen based on the expected number of homozygous offspring and the numbers required to set up a large colony to breed with a different transgenic line [3]) and ear biopsies were taken from all offspring at weaning (n=50) as described above. Primers were designed using Primer3 (web version 4.1.0) to produce a product spanning the insertion site on chromosome 12 (Primers 5-6, Table 1). PCR reactions were set up as described under Maintenance of transgenic mice, except that REDExtract-N-Amp™ 2X PCR Reaction Mix (Sigma-Aldrich, #R4775) was used instead of PrimeSTAR HS Premix.

Statistical analysis
Observed and expected genotype frequencies were compared (chi-square test) using GraphPad Prism version 8.00 for Mac (GraphPad Software, La Jolla, California, USA, www.graphpad.com).
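
For readers without access to GraphPad Prism, the observed-versus-expected comparison (reported in the Results as a chi-square statistic) can be approximated with a short script. The following is a minimal sketch, not the authors' analysis: it assumes a chi-square goodness-of-fit test against the 1:2:1 Mendelian ratio expected from a hemizygous intercross, and the function names and genotype labels are hypothetical. With three genotype classes the test has two degrees of freedom, for which the p-value has the closed form exp(-x/2), so no statistics library is needed.

```python
import math

# Expected Mendelian proportions for a hemizygous x hemizygous intercross
# (assumption: standard 1:2:1 segregation of a single integration site).
MENDELIAN_1_2_1 = {"wild-type": 0.25, "hemizygous": 0.50, "homozygous": 0.25}


def chi_square_gof(observed, expected_props=MENDELIAN_1_2_1):
    """Chi-square goodness-of-fit statistic for observed genotype counts."""
    total = sum(observed.values())
    stat = 0.0
    for genotype, proportion in expected_props.items():
        expected = total * proportion
        stat += (observed[genotype] - expected) ** 2 / expected
    return stat


def p_value_df2(stat):
    """Upper-tail p-value of a chi-square statistic with 2 degrees of freedom.

    For df = 2 the chi-square survival function is exactly exp(-x / 2).
    """
    return math.exp(-stat / 2.0)
```

As a consistency check, `p_value_df2(2.64)` evaluates to approximately 0.2671, which matches the statistic and p-value pair reported in the Results. To test a colony, pass the tallied genotype counts as a dictionary keyed by the three genotype labels.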

Results
The NEFH-tTA transgene is located on chromosome 12
Using TLA, the highest coverage is observed on the sequences directly surrounding the location of the primer set. Here, high coverage is observed on chromosome 12, outlined in red in Figure 1A, indicating that the transgene (TG) has integrated in chromosome 12 [4].
From locus-wide coverage (Figure 1B) it was concluded that the TG has integrated in mouse chr12:6917896-chr12:6917912. Reads marking the TG integration are identified in Table 4. The 11 bases between chr12:6917896-chr12:6917912 have been deleted following the integration event. According to the reference sequence (mm10) there are no genes annotated in this region. Complete coverage is observed across the whole TG sequence from TG:68-18543 (Figure 1C). At the edges of the coverage, fusion reads have been found connecting these ends, but the TG:1-68 and TG:18544-end regions are not integrated into the mouse genome (Table 4). These regions contain sequences for bacterial DNA replication and are therefore not required for transgene expression in the mouse.
Next to the head-to-tail fusion as reported above, four other TG-TG fusions were found within the TG sequence (Table 4).
While an exact copy number cannot be determined using TLA, an estimation can be made based on the number of integration sites, number of fusion reads and the ratio of coverage on the TG and genome integration site. Here, one integration site was found and a total of five TG-TG fusions were found. The coverage at the TG was significantly higher compared to the genome. These facts together indicate that >5 copies of the TG have integrated.

Design of a PCR assay to distinguish between hemizygous and homozygous transgenic mice
We designed primers to amplify a wild-type PCR product of 488bp from chromosome 12, which would be disrupted by insertion of the transgene (Figure 2A). A multiplex PCR reaction including the original primers to the transgene [1] (Table 1) was carried out on samples from a hemizygous intercross and proved able to distinguish between wild-type (488bp band only), hemizygous (both bands) and homozygous (150bp band only) animals (Figure 2B).

Homozygous NEFH-tTA mice are observed at the same rate as expected
After genotyping 50 offspring of hemizygous intercrosses we observed no significant difference in the number of NEFH-tTA +/+ mice compared to the expected ratio (Figure 3, chi-square=2.64, p=0.2671 [4]), indicating that our novel genotyping assay performs as expected. Homozygous mice grew to adulthood and did not exhibit any overt phenotype.
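
The band-pattern logic of the multiplex assay can be written down as a small lookup, which may help when scoring gels in bulk. This is an illustrative sketch only: the 488bp and 150bp product sizes come from the text above, while the function name, the size tolerance of 20bp and the "no call" outcome for reactions with neither band are assumptions made for the example rather than part of the published protocol.

```python
WILD_TYPE_BAND_BP = 488  # product spanning the chromosome 12 insertion site
TRANSGENE_BAND_BP = 150  # product of the original transgene primers


def call_genotype(band_sizes_bp, tolerance_bp=20):
    """Assign a genotype from the band sizes (in bp) read off an agarose gel.

    tolerance_bp is a hypothetical allowance for sizing error against the
    ladder; bands outside the tolerance of both products are ignored.
    """
    def has_band(target_bp):
        return any(abs(size - target_bp) <= tolerance_bp for size in band_sizes_bp)

    wild_type_allele = has_band(WILD_TYPE_BAND_BP)
    transgenic_allele = has_band(TRANSGENE_BAND_BP)

    if wild_type_allele and transgenic_allele:
        return "hemizygous"
    if wild_type_allele:
        return "wild-type"
    if transgenic_allele:
        return "homozygous"
    return "no call"  # neither product amplified; repeat the reaction
```

For example, `call_genotype([150, 488])` returns `"hemizygous"` and `call_genotype([150])` returns `"homozygous"`. Because a homozygous call rests on the absence of the 488bp band, a failed wild-type amplification would be indistinguishable from homozygosity in this scheme, so borderline samples are best re-run.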

Discussion/conclusion
The NEFH-tTA transgenic mouse line is a useful tool for studying a wide range of diseases including frontotemporal dementia and motor neurone disease, as well as other neurodevelopmental, neuromuscular or neurodegenerative disorders.
Here we have shown that multiple copies of the transgene have inserted into chromosome 12. Furthermore, we have designed and utilised a novel genotyping assay to distinguish between hemizygous and homozygous mice, involving a simple PCR assay. This is easily adaptable to a laboratory-specific protocol or machine and will allow researchers using this line to refine their breeding strategies and reduce the number of animals that cannot be used in experiments.

Data availability
Underlying data
Repository: "A novel genotyping method to determine copy number in a mouse line commonly used for inducible transgene expression in brain and spinal cord - Raw data/analysis". DOI: https://doi.org/10.6084/m9.figshare.12982220.

Sandy S. Pineda
Brain and Mind Centre, University of Sydney, Sydney, Australia
In the manuscript entitled "A novel genotyping method to determine copy number in a mouse line commonly used for inducible transgene expression in brain and spinal cord", Hobbs and colleagues present the use of targeted locus amplification (TLA) analysis to identify the integration site of the NEFH-tTA transgene in B6;C3-Tg(NEFH-tTA)8Vle/J mice, and develop a simple genotyping assay to distinguish heterozygous from homozygous animals for the transgene. The use of TLA allowed the mapping of the random integration site to ch12qA1.1 in the mouse genome. The study is quite simple, but clear in its objective of providing researchers with a new genotyping protocol that would facilitate the handling of the mouse colony and the planning of future experiments.
However, there are some minor changes that would improve the manuscript:
○ Could the authors extend the materials and methods section to include a summary of the TLA method and the bioinformatics commands used? A summary figure could also be included to summarise the method.
○ Since TLA was used to get the exact location of the integration, a supplementary FASTA file with the entire sequence including the junctions might help the reader to better understand the insertion site.
○ I would have suggested doing a high-resolution copy number analysis as an extra step. I find the use of copy number in the title somewhat misleading, since the only confirmation of the copy number comes indirectly from the TLA analysis.
○ I do not see the reason for Figure 3; it would be best removed or merged into Figure 2. If kept, you will also need to provide more details of the statistical analysis done, perhaps by adding the table right next to it.

Is the work clearly and accurately presented and does it cite the current literature?
Yes

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Yes
Methods: How many times were the transgenic mice backcrossed to the C57BL/6J mice before they were used in this study?

Results:
○ In the first paragraph, transgene is abbreviated as TG, but the word "transgene" has already appeared in the Introduction.
○ Figure 3 would be better presented as a table showing the number and ratio of each genotype.

Is the work clearly and accurately presented and does it cite the current literature? Yes
Is the study design appropriate and is the work technically sound? Yes

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes

In this manuscript from Eleanor Hobbs et al., the authors report the use of targeted locus amplification to identify the integration site of the NEFH-tTA transgene in B6;C3-Tg(NEFH-tTA)8Vle/J mice, which have been used in disease models for neuron-specific expression of doxycycline-regulatable cassettes in the brain and spinal cord. Having mapped the NEFH-tTA integration to ch12qA1.1, the authors proceed to develop a simple PCR assay to discriminate between mice hemizygous and homozygous for the transgene.
The study is well performed and lucidly communicated. The reported PCR assay is anticipated to be of use to investigators breeding B6;C3-Tg(NEFH-tTA)8Vle/J mice, facilitating crossing to generate homozygotes. Some minor amendments would enhance the manuscript and these are discussed below:
○ The reference to 'copy number' in the title is misleading since the reported genotyping method actually assesses zygosity of the NEFH-tTA locus rather than NEFH-tTA transgene copy number per se.
○ The amplicon reads from targeted locus amplification are aligned to GRCm38 despite an updated assembly (GRCm39) now being available. While it is unnecessary to repeat the bioinformatic analyses, it would be useful to add the NEFH-tTA 5' and 3' integration coordinates in terms of the latest Mm genome build.
○ It would be helpful to provide a FASTA reference file for the NEFH-tTA sequence so that readers can more easily interpret the junction and fusion sites identified in Table 4.
○ The manuscript indicates that similar targeted locus amplification findings were found for the second primer set and references the online 'Underlying data' files (Fig. 1A legend). However, data showing the alignment of NGS reads by chromosome for primer set 2 are not actually in the supplemental files. This would be best resolved by adding the missing data or, less optimally, by removing reference to the second primer pair or clarifying what is meant by "similar results".
○ The molecular biology aspects of the targeted locus amplification method are not described. Moreover, the Illumina instrument used should be identified and the 'bwa bwasw' alignment parameters should be given.
○ The second paragraph of the Results section indicates that the NEFH-tTA integration coordinates were identified from locus-wide coverage data (Fig. 1B), but presumably, it was actually the sequence of the ch12/TG junction reads that identified the integration site.
○ The treatment of data pertaining to the NEFH-tTA integration copy number in the ch12 locus is not entirely clear (results paragraphs 2 and 3). Is it possible that any of the TG-TG fusion amplicons were generated secondary to the fragmentation and re-ligation procedure used in targeted locus amplification? Without orthogonal sequencing of the locus, the reader is left unclear as to the proposed arrangement of the putative multiple transgene copies at the integration site. Left unresolved, the corresponding statement in the conclusion ("…we have shown that multiple copies of the transgene have inserted…") may be viewed as excessively definitive.
○ Figure 3 is superfluous and could be replaced by the Chi-square dependency table.
Is the work clearly and accurately presented and does it cite the current literature? Yes

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Partly

Are the conclusions drawn adequately supported by the results? Partly
Competing Interests: No competing interests were disclosed.
We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however we have significant reservations, as outlined above.