A bovine CD18 signal peptide variant with increased binding activity to Mannheimia haemolytica leukotoxin

Background: Mannheimia haemolytica is the major bacterial infectious agent of bovine respiratory disease complex and causes severe morbidity and mortality during lung infections. M. haemolytica secretes a protein leukotoxin (Lkt) that binds to the CD18 receptor on leukocytes, initiates lysis, induces inflammation, and causes acute fibrinous bronchopneumonia. Lkt binds the 22-amino acid CD18 signal peptide domain, which remains uncleaved in ruminant species. Our aim was to identify missense variation in the bovine CD18 signal peptide and measure the effects on Lkt binding. Methods: Missense variants in the integrin beta 2 gene ( ITGB2) encoding CD18 were identified by whole genome sequencing of 96 cattle from 19 breeds, and targeted Sanger sequencing of 1238 cattle from 46 breeds. The ability of different CD18 signal peptide variants to bind Lkt was evaluated by preincubating the toxin with synthetic peptides and applying the mixture to susceptible bovine cell cultures in cytotoxicity-blocking assays. Results: We identified 14 missense variants encoded on 15 predicted haplotypes, including a rare signal peptide variant with a cysteine at position 5 (C 5) instead of arginine (R 5). Preincubating Lkt with synthetic signal peptides with C 5 blocked cytotoxicity significantly better than those with R 5. The most potent synthetic peptide (C 5PQLLLLAGLLA) had 30-fold more binding activity compared to that with R 5. Conclusions: The results suggest that missense variants in the CD18 signal peptide affect Lkt binding, and animals carrying the C 5 allele may be more susceptible to the effects of Lkt. The results also identify a potent class of non-antibiotic Lkt inhibitors that could potentially protect cattle from cytotoxic effects during acute lung infections.


Introduction
Mannheimia haemolytica is the major bacterium associated with bovine respiratory disease, a condition caused by a heterogeneous complex of highly infectious pathogens that is the primary cause of morbidity, mortality, and economic loss in the beef and dairy cattle industries 1,2. M. haemolytica is a commensal bacterium found in tonsillar crypts and the upper respiratory tracts of healthy cattle 3,4. Exposure to environmental stresses or co-infection with other viral or bacterial pathogens can impair host defenses, allowing M. haemolytica to proliferate and colonize the lungs, where infection causes acute fibrinonecrotic pleuropneumonia 2,5,6. This bacterium expresses a variety of virulence factors that contribute to disease pathogenesis in the lungs. However, leukotoxin (Lkt) is the primary virulence factor contributing to the clinical signs and severe lung damage observed following infection 7,8. Within hours of bacterial colonization of the lung, large numbers of polymorphonuclear leukocytes (PMN) infiltrate the airways 2,9. Lkt binds to the bovine CD18 subunit of the heterodimeric integrins on the surface of PMN, causing cell lysis and the release of pro-inflammatory cytokines, proteolytic enzymes, and reactive oxygen intermediates that intensify local inflammation 6,10-13. Experimental depletion of PMN prior to infection 14, or infection with an Lkt-deletion mutant of M. haemolytica 7,8, results in decreased morbidity and reduced lung lesions in calves. Thus, the interaction between the toxin and its receptor is critical to the pathogenesis of M. haemolytica infection and is a potential intervention point for the prevention of disease.
A 13-amino acid sequence in the CD18 signal peptide has been identified as the site that binds bacterial Lkt 10. The 22 amino acids that comprise the CD18 signal peptide remain uncleaved in leukocytes of ruminant species due to a conserved cleavage-inhibiting glutamine residue at position 18 (Q 18 ) of the propeptide 10.
In non-ruminant species, such as human and murine, leukocytes are naturally resistant to Lkt because their CD18 signal peptides undergo cleavage due to a glycine residue at position 18 (G 18 ). However, when murine cell lines were transfected with bovine ITGB2, the gene which encodes CD18, they became susceptible to Lkt. When site-directed mutagenesis of ITGB2 was used in the same murine cell lines to change the bovine Q 18 residue to G 18 , the bovine CD18 signal peptide was cleaved and the murine cells once again became resistant to Lkt-induced lysis 10 . This strategy was taken further with gene-editing, showing that leukocytes isolated from a cloned bovine fetus, homozygous for CD18 G 18 , were unaffected by Lkt exposure because the signal peptide was cleaved and unavailable for Lkt binding 15 . Thus, retention of the ruminant CD18 signal peptide appears to be the cause of Lkt sensitivity in leukocytes.
Naturally occurring CD18 amino acid sequence variation can also interfere with Lkt cytotoxicity. In Holstein dairy cattle, a CD18 substitution of glycine for aspartate at polypeptide position 128 in the extracellular I-like domain of CD18 causes bovine leukocyte adhesion deficiency (BLAD) in homozygous animals 16,17 . These calves do not express functional CD18 on the surface of their leukocytes and have significantly reduced sensitivity to Lkt compared to control calves 18,19 . We hypothesized that other variation in the CD18 polypeptide sequence, if it exists, may alter the Lkt-CD18 binding interaction or cell signaling and result in differences in lymphocyte sensitivity to M. haemolytica Lkt. Thus, the goals of this study were to identify CD18 protein variants encoded by ITGB2 in U.S. cattle breeds, and evaluate the effects of signal peptide variants on Lkt binding. We report the identification of 15 predicted protein variants, including one with enhanced Lkt binding.

Ethics statement
All animal procedures were reviewed and approved by the U.S. Department of Agriculture, Agricultural Research Service, U.S. Meat Animal Research Center (USMARC) Institutional Animal Care and Use Committee (IACUC project number 2.2).
Panels of cattle DNA used for missense mutation discovery
Two panels of DNAs were used to determine ITGB2 genotypes from U.S. cattle. The first was a previously described panel of 96 unrelated beef cattle from 19 popular U.S. beef breeds that had already been characterized by whole-genome sequencing 20. This panel was used to identify predicted coding changes throughout the ITGB2 gene. The second panel included a non-overlapping set of 1142 unrelated cattle from 46 breeds, on which targeted Sanger sequencing was performed to identify any predicted coding changes in the signal peptide region and the region containing the D128G variant causing BLAD (ITGB2 exons 2, 3, and 5). Briefly, the first panel of 96 beef cattle (USMARC Beef Cattle Diversity Panel version 2.9 [MBCDPv2.9]) was based on commercially available purebred registered sires. Pedigrees were obtained from leading suppliers of U.S. beef cattle semen and analyzed to identify unrelated individuals for inclusion. The number of sires representing each breed (four, five, or six) was based on their numbers of registered progeny circa 2000: Angus (n = 6), Hereford (n = 6), Charolais (n = 6), Simmental (n = 6), Red Angus (n = 6), Limousin (n = 6), Gelbvieh (n = 6), Brangus (n = 5), Beefmaster (n = 5), Salers (n = 5), Shorthorn (n = 5), Maine-Anjou (n = 5), Brahman (n = 5), Chianina (n = 4), Texas Longhorn (n = 4), Santa Gertrudis (n = 4), Braunvieh (n = 4), Corriente (n = 4), and Tarentaise (n = 4). On the basis of the number of registered progeny, the breeds were estimated to represent greater than 99% of the germplasm used in the U.S. beef cattle industry, contain more than 187 unshared haploid genomes, and allow a 95% probability of detecting any allele with a frequency greater than 0.016 21.
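The 95% detection figure is consistent with a simple sampling argument. As a minimal sketch, assuming the 187 unshared haploid genomes behave as independent draws from the population (a simplification; the function name is illustrative and not part of the original analysis), the probability of observing an allele of frequency f at least once among n genomes is 1 - (1 - f)^n:

```python
def detection_probability(allele_freq: float, n_haploid_genomes: int) -> float:
    """Probability that an allele of the given frequency is seen at least once
    among n independently sampled haploid genomes (binomial assumption)."""
    return 1.0 - (1.0 - allele_freq) ** n_haploid_genomes


# Values quoted in the text: allele frequency 0.016, 187 unshared haploid genomes.
p_detect = detection_probability(0.016, 187)
print(f"{p_detect:.3f}")  # approximately 0.951
```

Under this assumption, rarer alleles are detected less reliably; the same function can be used to explore other frequencies or panel sizes.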
The second panel of 1142 cattle consisted of samples from male and female registered purebred cattle with diverse pedigrees from 46 breeds. Samples were from semen, blood, or hair follicles, depending on gender and availability 22. Where possible, animals within breed were chosen so they did not share parents or grandparents, and none were closely related to the 96 sires in the […]
[…] DNA from commercial bull semen was extracted similarly, with slight modification 23. Briefly, three 0.5 ml straws of commercial semen from a single animal were pooled, and the cells were collected by centrifugation for 5 min at 1000 x g. The cell pellet was washed three times in 1 ml of a wash solution (TE with 100 mM NaCl, TNE) and suspended in 1 ml of the same solution with 1% wt/vol sodium dodecyl sulfate, 1 mg proteinase K (Sigma-Aldrich), and 40 mM dithiothreitol (DTT) […]
The sense and antisense primer sequences for exons 2 and 3 were 5'-AGG-GAG-ACT-GAC-CTG-TGT-G-3' and 5'-CTG-GGA-AGC-AGA-GTG-ATA-GT-3', respectively (USMARC primer no. 89878 and 89880). The sense and antisense primer sequences for exon 5 were 5'-AGA-GAG-ATC-CAG-GTA-GAA-CTG-3' and 5'-GTG-CAG-AGG-TGC-AGA-GGT-G-3', respectively (USMARC primer no. 89887 and 89889).
Whole genome sequencing of BL3 cells
Whole genome sequencing of BL3 cells (a bovine lymphoma cell line; kindly provided by Dr. Subramaniam Srikumaran) was accomplished with methods as described elsewhere 20. Briefly, genomic DNA was used to make a 500 bp paired-end library and sequenced with a massively parallel sequencing machine and high-output kits (NextSeq500, two by 150 paired-end reads, Illumina, San Diego, CA, USA) until a minimum of 40 GB of data with greater than Q20 quality was collected. After sequencing, the raw reads were filtered to remove adaptor sequences, contaminating dimer sequences, and low-quality reads. The DNA sequence alignment process was similar to that previously reported 20.
FASTQ files were aggregated for each sample and DNA sequences were aligned individually to the bovine reference assembly UMD3.1 28 with the Burrows-Wheeler aligner (BWA) aln algorithm version 0.7.12 29, then merged and collated with bwa sampe. The resulting sequence alignment map (SAM) files were converted to binary alignment map (BAM) files, and subsequently sorted via SAMtools version 0.1.18 30. Potential PCR duplicates were marked in the BAM files using the Genome Analysis Toolkit (GATK) version 1.5-32-g2761da9 31. Regions in the mapped dataset that would benefit from realignment due to small indels were identified with the GATK module RealignerTargetCreator, and realigned using the module IndelRealigner. The BAM files produced at each of these steps were indexed using SAMtools. The resulting indexed BAM files were made available via the USMARC WGS browser and the raw reads for the BL3 cell line were deposited at NCBI BioProject PRJNA325058, BioSample number SAMN05217649. Mapped datasets for each sample were individually genotyped with the GATK UnifiedGenotyper with arguments "--alleles" set to the VCF file (Extended Data File S1) 32, "--genotyping_mode" set to "GENOTYPE_GIVEN_ALLELES", and "--output_mode" set to "EMIT_ALL_SITES". Lastly, some SNP variants were identified manually by inspecting the sequence with IGV software version 2.1.28 33,34 (described in the Methods section entitled 'Identifying predicted protein variants encoded by bovine ITGB2'). In these cases, read depth, allele count, allele position in the read, and quality score were considered when the manual genotype determination was made.
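The alignment and genotyping steps above can be summarized as an ordered list of command lines. The sketch below only assembles the commands as argument lists and does not run them. File names, the GATK jar path, and the helper name `build_pipeline` are hypothetical, and the options shown are the basic ones for these tool versions rather than the authors' full settings. Duplicate marking is left out because its exact invocation is not given in the text.

```python
from typing import List


def build_pipeline(sample: str, ref: str, alleles_vcf: str,
                   gatk_jar: str = "GenomeAnalysisTK.jar") -> List[List[str]]:
    """Return the command lines (as argument lists) for one sample.

    Assumed inputs: paired FASTQ files named <sample>_1.fastq / <sample>_2.fastq
    and a BWA-indexed reference FASTA. Nothing is executed here.
    """
    fq1, fq2 = f"{sample}_1.fastq", f"{sample}_2.fastq"
    sai1, sai2 = f"{sample}_1.sai", f"{sample}_2.sai"
    sorted_bam = f"{sample}.sorted.bam"
    intervals = f"{sample}.intervals"
    realigned = f"{sample}.realigned.bam"
    gatk = ["java", "-jar", gatk_jar]
    return [
        # Align each read file separately, then pair the alignments.
        ["bwa", "aln", "-f", sai1, ref, fq1],
        ["bwa", "aln", "-f", sai2, ref, fq2],
        ["bwa", "sampe", "-f", f"{sample}.sam", ref, sai1, sai2, fq1, fq2],
        # SAM -> BAM, sort (SAMtools 0.1.x takes an output prefix), index.
        ["samtools", "view", "-bS", "-o", f"{sample}.bam", f"{sample}.sam"],
        ["samtools", "sort", f"{sample}.bam", f"{sample}.sorted"],
        ["samtools", "index", sorted_bam],
        # Local realignment around small indels.
        gatk + ["-T", "RealignerTargetCreator", "-R", ref,
                "-I", sorted_bam, "-o", intervals],
        gatk + ["-T", "IndelRealigner", "-R", ref, "-I", sorted_bam,
                "-targetIntervals", intervals, "-o", realigned],
        # Genotype only the sites listed in the supplied VCF, emitting all sites.
        gatk + ["-T", "UnifiedGenotyper", "-R", ref, "-I", realigned,
                "--alleles", alleles_vcf,
                "--genotyping_mode", "GENOTYPE_GIVEN_ALLELES",
                "--output_mode", "EMIT_ALL_SITES",
                "-o", f"{sample}.vcf"],
    ]


commands = build_pipeline("BL3", "UMD3.1.fa", "sites.vcf")
for cmd in commands:
    print(" ".join(cmd))
```

Keeping the commands as data makes the order of steps explicit and lets each one be inspected before anything is run.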
Identifying predicted protein variants encoded by bovine ITGB2
Aligned WGS data from 96 sires of MBCDPv2.9 were visually analyzed in the ITGB2 coding region to identify potential CD18 protein variants. Viewing the aligned sequences and detecting variants was accomplished with the IGV software and a browser developed for this purpose. Briefly, public internet sites at the USDA, ARS, USMARC were used in combination with open source software installed on a laptop computer, and variants were recorded manually in a spreadsheet as previously described 20. A Java Runtime Environment (Oracle Corporation, Redwood Shores, CA, USA) was first installed on the computer. When links to the data were selected by the user, IGV software 33,34 was loaded from a third-party site (University of Louisville, Louisville, KY, USA) and aligned DNA sequence reads were displayed in the context of the bovine UMD3.1 reference genome assembly. For viewing ITGB2 gene variants, WGS from a set of eight animals of different breeds was loaded, and the IGV browser was directed to the appropriate genome region by entering "ITGB2" in the search field. The IGV zoom function was used to view the first exon at nucleotide resolution with the [show translation] option selected in IGV. An example of the alignment view for ITGB2 codon 27 with eight animals is shown in Extended data, Figure S1 35.
The exon sequences were visually scanned for polymorphisms predicted to alter amino acid sequences, including missense, nonsense, frameshift, splice site, and insertion/deletion mutations. An in silico analysis of other potential splice-affecting variants was not performed as there are no consensus guidelines on the selection of programs or protocols to interpret the predicted results in cattle. Once identified, the variant nucleotide position was viewed and recorded for all 96 animals. The codons affected by SNP alleles were translated into their corresponding amino acids with IGV, codon tables, and knowledge of the CD18 protein sequence (NP_786975). Haplotype phases of predicted polypeptide variants were unambiguously assigned with homozygous individuals, and those with only one variant amino acid. A maximum parsimony phylogenetic tree was manually constructed from the unambiguously phased protein variants and used to infer phases in the remaining variants with maximum parsimony assumptions.
MALDI-TOF MS genotyping of 14 ITGB2 missense mutations
A single multiplex assay was designed for the 14 ITGB2 missense SNPs with software provided by the manufacturer (Agena Biosciences, San Diego, CA, USA). The oligonucleotide sequences and assay conditions are provided in Table S1. After design and validation with bovine control DNAs for each SNP, DNAs from the 96 bulls in the MBCDPv2.9 diversity panel were tested in a blinded experiment. Assay design and genotyping were performed at GeneSeek (Lincoln, NE, USA) with the MassARRAY platform and iPLEX Gold chemistry according to the manufacturer's instructions (Agena Biosciences).
Lkt preparation, gel electrophoresis, and protein immunoblotting
M. haemolytica strains for toxin production were isolated from cattle with severe fibrinous pleuropneumonia in feedlot environments and had complete closed whole genome sequence assemblies available at NCBI. M. haemolytica strain 89010807 N serotype A1 (lktA+) has been widely used for Lkt production for in vitro cytotoxicity assays and has the added advantage of being the parent strain of an isogenic leukotoxin deletion mutant (lktA-) 36,37. A second strain, M. haemolytica strain USDA-ARS-USMARC-183 serotype A1, was isolated from an animal that was part of a high-mortality respiratory disease outbreak in a Kansas feedlot in 1991 and represents the first strain with a complete closed genome assembly 38; however, it had not previously been used in in vitro assays.
Isolates were maintained on Brain Heart Infusion (BHI) agar (Sigma-Aldrich), and frozen stocks were kept in BHI broth with 20% glycerol at -80°C. RPMI 1640 medium (without Phenol Red and L-glutamine, Sigma-Aldrich) and semi-defined medium 2 (SDM2) were used for batch culture production of Lkt. SDM2 is an amino acid-limited culture medium supplemented with cysteine, glutamine, ferric iron, and manganese and was previously shown to greatly improve Lkt production in aerobic batch culture 39. For Lkt production, a single, 24-hour colony isolate from BHI agar was inoculated into 5 ml BHI broth in a 10 ml culture tube and incubated overnight at 37°C in 5% CO 2 without shaking. The following morning, 1 ml of BHI liquid culture was inoculated into 100 ml of fresh culture medium in a 300 ml Delong-style Erlenmeyer flask with baffles (Corning, Inc., Corning, NY, USA) and incubated at 37°C, 250 rpm, in 5% CO 2. At intervals, 14 ml samples were removed and centrifuged at 13,100 x g for 10 min at 4°C. The clarified supernatant was decanted, flash frozen in liquid nitrogen, and stored at -80°C until use.
Lkt and other proteins secreted into the growth media by M. haemolytica were analyzed by SDS-PAGE. Clarified supernatants were precipitated with one volume of acetone on ice for 30 min, followed by centrifugation at 20,800 x g for 5 min at room temperature, and the pellets were air dried for 30 min. Sedimented proteins were dissolved in a commercial sample buffer with lithium dodecyl sulfate and dithiothreitol, used per the manufacturer's instructions (Thermo Fisher Scientific). Samples were heated to 70°C for 10 min and loaded on 4-12% precast polyacrylamide Bis-Tris gels (Thermo Fisher Scientific), which were run at 160 volts for approximately 35 min in 2-[N-morpholino]ethanesulfonic acid (MES) SDS running buffer (Thermo Fisher Scientific). Proteins separated by SDS-PAGE were stained with Coomassie dye reagent (GelCode Blue, Thermo Fisher Scientific) and destained in water. Prestained protein standards (Novex Sharp, Thermo Fisher Scientific) were used to estimate molecular weights of M. haemolytica proteins from clarified supernatants. ImageJ software (version 1.52A) was used to estimate relative proportions of protein bands on Coomassie-stained PAGE gels 40.
For protein immunoblots (western blots), proteins from SDS-PAGE gels were electrophoretically transferred to 0.2 µm polyvinylidene difluoride membranes (PVDF, Invitrolon, Thermo Fisher Scientific) with a Mini Blot Module (Thermo Fisher Scientific) per the manufacturer's instructions. PVDF membranes were wetted in 100% methanol prior to equilibrating in transfer buffer (Bolt transfer buffer, Thermo Fisher Scientific) and assembling in the blotting apparatus. Proteins were transferred to membranes for 60 min at a constant voltage of 20 V. Blots were removed from the apparatus and incubated in a blocking reagent (StartingBlock(PBS), Thermo Fisher Scientific) for 60 min at room temperature with gentle agitation. This solution was replaced with fresh blocking reagent containing a rabbit polyclonal Lkt antibody at a concentration of 1 µg/ml (M. haemolytica Lkt Antibody, catalog number LS-C369014, LifeSpan BioSciences, Inc., Seattle, WA, USA) and incubated as above for 60 min. To remove unbound primary antibody, the blot was washed three times for 10 min each in 25 mM Tris, 0.15 M NaCl, 0.05% Tween-20, pH 7.5 (TBS Tween-20, Thermo Fisher Scientific). After washing, a goat anti-rabbit IgG antibody conjugated to horseradish peroxidase (HRP) (catalog number ab97040, Abcam, Cambridge, MA, USA) was added at a concentration of 1 µg/ml in blocking reagent and incubated for 60 min at room temperature with gentle agitation. The blot was again washed three times for 10 min each in TBS Tween-20 prior to detection with chemiluminescent substrate (Pierce ECL Western Blotting substrate, Thermo Fisher Scientific). The immunoblot was incubated in chemiluminescent substrate for 1 min and imaged for approximately 5 min (ChemiDoc, Bio-Rad Laboratories, Inc., Hercules, CA, USA).
PBMC and PMN cell isolation
Primary cells were collected from two mixed breed animals (kept as part of the USMARC cattle population) that were each homozygous for the most common ITGB2 haplotype (variant "1"). For isolation of primary bovine cells, 50 ml of blood were collected by jugular puncture using 16-gauge needles into syringes containing EDTA as an anticoagulant. PMN were isolated using a standard hypotonic lysis procedure. Briefly, blood was spun for 25 min at 1000 x g at 4°C. Plasma and buffy coat layers were removed and discarded. Sterile water was added to the red blood cell (RBC) layer to lyse RBC, followed by addition of 10X PBS to restore tonicity. PMN were isolated by centrifugation for 10 min at 250 x g at 4°C. The PMN cell pellet was washed three times with 1x PBS and the final cell pellet was resuspended in RPMI 1640 medium.
Peripheral blood mononuclear cells (PBMC) were isolated essentially as described 41. Briefly, PBMC were isolated over Ficoll-Paque Plus (GE Healthcare Bio-Sciences AB, Uppsala, Sweden), as per the manufacturer's instructions, with modification. Specifically, 15 ml of whole blood mixed 1:1 with PBS was underlayed beneath 14 ml of the density gradient in a 50 ml conical tube. The tubes were then centrifuged for 45 min at 900 x g at room temperature with no brake. The PBMC layer was carefully removed and brought up to 45 ml in PBS in a new 50 ml conical tube, followed by centrifugation for 15 min at 400 x g at 4°C with high brake. Erythrocytes were removed using RBC lysing buffer (Sigma-Aldrich). The PBMC pellet was further washed three times with 1x PBS and the final pellet was resuspended in RPMI 1640 medium.

MTT dye-reduction cytotoxicity assay
The ability of different CD18 signal peptide variants to bind Lkt and inhibit Lkt-induced cytolysis was measured with the MTT dye-reduction cytotoxicity assay 10. The BL3 cell line was selected because it is the best-studied, readily available, immortalized cell line susceptible to Lkt-induced cytolysis. CD18 signal peptides were tested at concentrations ranging from 50 µM to 0.195 µM. CD18 signal peptides were diluted using serial 2-fold dilutions in 96-well round bottom plates containing 50 µl/well of Lkt at a 50% toxicity end point titer in colorless RPMI 1640 medium without phenol red (Sigma-Aldrich). Synthetic signal peptides and Lkt preparations were pre-incubated for 1 hr on ice prior to the addition of 5 × 10^5 cells/well. These cells were added as a 50 µl suspension with a density of 1 × 10^7 cells/ml in colorless RPMI. Cells were incubated with synthetic signal peptides and Lkt for 1 hr at 37°C in 5% CO 2 and subsequently centrifuged at 600 x g for 7 min at 4°C, and the supernatant was removed and discarded. Cells were resuspended in 100 µl of colorless RPMI and 20 µl 0.5% MTT (3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyl-2H-tetrazolium bromide; Sigma-Aldrich), incubated for 20 min at 37°C in 5% CO 2, and centrifuged as before, and the supernatant was removed and discarded. The purple formazan precipitate was then dissolved in 100 µl of acid isopropanol (0.04N HCl in isopropanol). Following a 5 min incubation, cellular debris was pelleted and the supernatant was transferred to a new 96-well plate. The optical density (OD) of the samples was measured at 570 nm and 690 nm. The background measurement at 690 nm was then subtracted from the 570 nm measurement to give the background adjusted OD. The percent cytotoxicity was calculated as follows: (1-(OD of toxin treated cells/OD of cells without toxin)) x 100.
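The plate set-up arithmetic can be checked with a short sketch. The number of dilution points (nine) is inferred here from the stated endpoints of 50 µM and 0.195 µM and is an assumption, as are the function and variable names.

```python
def twofold_series(start_um: float, n_points: int) -> list:
    """Concentrations (µM) of a serial 2-fold dilution starting at start_um."""
    return [start_um / (2 ** k) for k in range(n_points)]


# Nine 2-fold steps take 50 µM down to about 0.195 µM.
peptide_um = twofold_series(50.0, 9)

# Cells per well: 50 µl of a suspension at 1e7 cells/ml (1 ml = 1000 µl).
cells_per_well = 50 * 1e7 / 1000

print([round(c, 3) for c in peptide_um])
print(cells_per_well)  # 500000.0, i.e. 5 x 10^5 cells/well
```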
The percent inhibition of cytotoxicity in the presence of synthetic CD18 peptides was calculated as follows: ((percent cytotoxicity in the absence of peptide -percent cytotoxicity in the presence of peptide)/ percent cytotoxicity in the absence of peptide) x 100. Given normal variability in the assay, values were occasionally obtained outside the range of 0 to 100% inhibition. Negative values were replaced with 0 and values greater than 100 were replaced with 100 for graphical presentation.
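The two formulas above, together with the clamping used for graphical presentation, can be written as a few small functions. This is a minimal sketch: the function names and the OD readings in the example are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def background_adjusted_od(od_570: float, od_690: float) -> float:
    """Subtract the 690 nm background from the 570 nm reading."""
    return od_570 - od_690


def percent_cytotoxicity(od_toxin: float, od_no_toxin: float) -> float:
    """(1 - OD of toxin-treated cells / OD of cells without toxin) x 100."""
    return (1.0 - od_toxin / od_no_toxin) * 100.0


def percent_inhibition(cytotox_without_peptide: float,
                       cytotox_with_peptide: float) -> float:
    """Relative reduction in cytotoxicity caused by the peptide, in percent."""
    return ((cytotox_without_peptide - cytotox_with_peptide)
            / cytotox_without_peptide) * 100.0


def clamp_for_plot(value: float) -> float:
    """Replace negative values with 0 and values above 100 with 100."""
    return max(0.0, min(100.0, value))


# Hypothetical readings (570 nm, 690 nm) for three wells.
od_cells_only = background_adjusted_od(1.25, 0.25)      # cells without toxin
od_toxin_only = background_adjusted_od(0.75, 0.25)      # toxin, no peptide
od_toxin_peptide = background_adjusted_od(1.00, 0.25)   # toxin plus peptide

cytotox_no_peptide = percent_cytotoxicity(od_toxin_only, od_cells_only)
cytotox_peptide = percent_cytotoxicity(od_toxin_peptide, od_cells_only)
inhibition = clamp_for_plot(percent_inhibition(cytotox_no_peptide, cytotox_peptide))
print(cytotox_no_peptide, cytotox_peptide, inhibition)  # 50.0 25.0 50.0
```

Clamping is applied only at the presentation step, so the unclamped values remain available for curve fitting if needed.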
Fresh bovine PBMC or PMN were used for comparison to results obtained using immortalized BL3 cells. Cells were suspended in 50 µl colorless RPMI medium and 10 µl Biolog Redox Dye MB, containing 500 µM water-soluble tetrazolium (Biolog, Hayward, CA, USA). This dye was used because it was found to be more sensitive to changes in cellular respiration when assaying primary cells (data not shown). Cells were incubated for 3 hr at 37°C in 5% CO 2, centrifuged as before, and the supernatant was transferred to a clean 96-well plate prior to the OD being measured at 590 nm and 750 nm. Percent cytotoxicity and percent inhibition were calculated using background adjusted OD values as described above.

Statistical analyses of IC 50 values
The half maximal inhibitory concentration (IC50) was estimated for each signal peptide variant to compare the concentration of peptide needed to block 50% of Lkt-induced cytotoxicity of susceptible bovine cells. Univariate nonlinear regression (NLIN procedure, SAS 9.4, SAS Institute Inc., Cary, NC, USA) was used to fit data from three replicates to the following equation: […], where Y ijk is the measured Lkt inhibition for the kth replicate of the ith peptide at the jth concentration (C j ) of peptide; T is the estimated maximum and B is the estimated minimum of the fitted nonlinear relationship between peptide concentration and Lkt inhibition; S i is the slope of the relationship at IC50 i , the estimated concentration at which the ith peptide inhibits Lkt-induced cytotoxicity by 50%; and e ijk is random variation. Standard errors of IC50 i were used for comparisons of estimates.
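To illustrate how a maximum, a minimum, a slope term, and an IC50 jointly shape a dose-response curve, the sketch below uses a four-parameter logistic function. This form is an assumption chosen because it is a common model with these four parameters. It is not necessarily the exact parameterization fitted with the NLIN procedure, and its `slope` argument is a Hill-type steepness coefficient rather than the literal slope at the IC50. The example parameter values are hypothetical, apart from the 17.9 µM IC50 reported in the Results for the 22-mer reference peptide.

```python
def four_parameter_logistic(conc: float, top: float, bottom: float,
                            ic50: float, slope: float) -> float:
    """Predicted percent inhibition at peptide concentration conc (conc > 0).

    The curve rises from `bottom` toward `top` as concentration increases and
    passes through the midpoint (top + bottom) / 2 when conc equals ic50.
    """
    return bottom + (top - bottom) / (1.0 + (ic50 / conc) ** slope)


# Hypothetical curve: 0 to 100% inhibition, steepness 1.5, IC50 of 17.9 µM.
for conc_um in (0.195, 1.5625, 17.9, 50.0):
    y = four_parameter_logistic(conc_um, top=100.0, bottom=0.0,
                                ic50=17.9, slope=1.5)
    print(f"{conc_um:>8} µM -> {y:5.1f}% inhibition")
```

In a model of this kind, a lower IC50 shifts the whole curve toward lower concentrations, which is how the peptide comparisons in the Results are read.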

Results
Identifying ITGB2 polymorphisms affecting the predicted amino acid sequence of CD18
Bovine ITGB2 consists of 16 exons spanning 29.1 kb of genomic DNA and encodes a 769 amino acid protein with multiple functional domains (Figure 1A,B). Using software to view the aligned genome sequences from 96 diverse bulls, 13 codons with 14 missense variants were identified (Extended Data Table S2) 42. No frameshift, splice-site, or indel polymorphisms were observed that would affect the predicted amino acid sequence. Two ITGB2 regions of interest were further selected for additional Sanger sequencing in 1142 purebred cattle from 46 breeds: the signal peptide region (exon 2), and the region containing the D128G variant causing BLAD (exon 5). DNA sequence analyses revealed no additional polymorphisms that were predicted to alter the polypeptide sequence, except the previously described D128G variant in Holstein cattle (data not shown). The genotypes were independently verified with a single, multiplexed, MALDI-TOF MS assay for 14 SNPs in the 96 diverse bulls, and in 1142 of 1168 cattle from 46 breeds (Extended Data Table S3) 43. Thus, in total, there were 14 missense variants identified in 13 codons (Table 1).
Determining ITGB2 haplotypes encoding different CD18 polypeptides
Identifying haplotypes that encode distinct combinations of missense variants on the CD18 polypeptide is important for evaluating their potential function. A total of 15 ITGB2 haplotypes were identified that, when translated, were predicted to encode different CD18 proteins (Table 2). These 15 predicted polypeptide sequences were placed in the context of a maximum parsimony phylogenetic tree (Figure 1C). Haplotypes encoding CD18 protein variants "1 to 7", "9", "13", and "14" were confirmed by their presence in homozygous animals. Haplotypes encoding CD18 protein variants "8", "10", and "15" were unambiguously confirmed in animals with only one heterozygous site. However, haplotype phase was ambiguous when the distance between two heterozygous sites exceeded the length of the DNA sequence read (150 bp in these WGS data sets). Thus, haplotypes for the remaining CD18 protein variants "11" and "12" were tentatively inferred from additional breed-level frequency information. For example, the inferred phase for variant "12" (P 155 L 656 ) was only observed in two of 27 Braunvieh cattle that were each heterozygous for both missense variants. However, all 25 of the other Braunvieh cattle sequenced were homozygous for the variant "1" (Table 3 and Extended Data Table S3) 43. Thus, it was reasonable to infer that P 155 (exon 5), and L 656 (exon 14) are present on one rare haplotype in two animals, rather than on two rare haplotypes in each animal. Similarly, the inferred phase for variant "11" (C 5 L 656 ) was only observed in six of 22 Wagyu cattle and each was heterozygous at positions 5 and 656. Since the 22 Wagyu cattle have a variant "1" frequency of 0.8, it seems likely that C 5 and L 656 variants are present on the same chromosome (i.e., diplotype "1,11") rather than split across two chromosomes (i.e., diplotype "5,15").
In spite of the potential for ambiguous haplotype phases with rare variants, the phylogenetic tree of predicted CD18 proteins provides a solid framework for further evaluation.
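The parsimony reasoning used for the Braunvieh example can be made concrete by counting chromosomes under the two possible phases. This is a minimal counting sketch, not a statistical test. It assumes that every animal other than the double heterozygotes is homozygous for variant "1", as stated for the 25 other Braunvieh cattle, and the function name is hypothetical.

```python
def phase_scenarios(n_animals: int, n_double_het: int) -> dict:
    """Compare cis and trans phasing of two rare missense alleles.

    cis:   both rare alleles sit on one chromosome, so each double
           heterozygote carries one rare haplotype and one variant "1".
    trans: the rare alleles sit on different chromosomes, so each double
           heterozygote carries two different rare haplotypes.
    """
    total_chromosomes = 2 * n_animals
    variant1_from_homozygotes = 2 * (n_animals - n_double_het)
    return {
        "cis": {
            "rare_chromosomes": n_double_het,
            "rare_haplotype_types": 1,
            "variant1_freq": (variant1_from_homozygotes + n_double_het)
                             / total_chromosomes,
        },
        "trans": {
            "rare_chromosomes": 2 * n_double_het,
            "rare_haplotype_types": 2,
            "variant1_freq": variant1_from_homozygotes / total_chromosomes,
        },
    }


# Braunvieh: 27 animals, 2 heterozygous at both positions 155 and 656.
braunvieh = phase_scenarios(n_animals=27, n_double_het=2)
for phase, counts in braunvieh.items():
    print(phase, counts)
```

The cis arrangement needs one rare haplotype on two chromosomes, whereas the trans arrangement needs two different rare haplotypes on four chromosomes, neither of which appears in any other animal of the breed.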

Evolutionary comparison of CD18 polypeptide sequences
Determining the most likely phylogenetic root of the CD18 tree is important for establishing the likely order of mutational  events. Comparing cattle CD18 precursor protein variants to those from closely related species in the Bos genus indicated that variant "3" (K 27 ) was the most likely root of the phylogenetic tree ( Figure 1C). Cattle are predicted to share a common ancestor with other species in the Bos genus approximately 5 million years ago. In addition, the K 27 variant was associated with indicine cattle breeds, while the N 27 variant was associated with taurine breeds. For example, the N 27 frequency in 840 cattle from 33 taurine breeds was 0.998, while the K 27 frequency in 70 Brahman, Indu Brazil, and Nelore cattle was 0.95. Thus, the structure of the rooted tree suggests that CD18 protein variants "1" and "3" are the ancestral polypeptide sequence of taurine and indicine breeds, respectively. The rooted tree also suggests that the distal nodes represent CD18 variants that have arisen sometime after the split between taurines and indicines, approximately 500,000 years ago.
The conservation of amino acid residues throughout vertebrate species is a measure of their potential impact on protein function. Highly conserved residues are more likely to be indispensable for function and thus, variation at these positions is assumed to be deleterious. However, based on its high frequency in most taurines, the N 27 variant does not appear to be deleterious.
Table footnotes (partial): […] Figure 1C. b The red bold residues are those differing from "variant 1". c The coefficient of determination for these frequencies (r 2 ) was 99.7. d Allele not detected in the indicated group of cattle.
The conservation of arginine at positions 3 (R 3 ) and 5 (R 5 ) in CD18 signal peptides was of particular interest because this region binds bacterial Lkt in ruminant species, which includes the Bovids, Cervids, Giraffids, musk deer, chevrotains, and pronghorns. The R 3 residue was conserved throughout the Bovinae, but not in sheep and goat, which have proline at that position (Figure 2). The histidine residue at position 3 (H 3 ) was rare in cattle, not observed in other ruminants, and on a distal node of the phylogenetic tree (CD18 polypeptide variant "7", Figure 1C). In addition, the C 5 variant was present on four distinct putative CD18 polypeptide variants: "9", "11", "13", and "15" (Figure 1C), including both taurine- and indicine-influenced breeds. The presence of the C 5 variant on multiple but infrequent haplotypes indicates recombination has occurred between this and other CD18 missense variants.
Figure caption (partial): […] Hereford […] Table S4). At variant sites in cattle, the residues were summarized for a representative subset of 35 species. TMRCA, estimated time to most recent common ancestor in millions of years 47; letters, IUPAC/IUBMB codes for amino acids; dot, amino acid residues identical to those in cattle "variant 1"; dash, not enough sequence similarity for comparison or missing residue in that peptide region.
Comparison of batch production methods of biologically-active Lkt with two reference strains of M. haemolytica
The discovery of missense variants in the CD18 signal peptide provided the opportunity to test these variants using in vitro cell assays with bacterial Lkt. However, producing sufficient and consistent batches of biologically-active Lkt from reference bacterial strains was a challenge with traditional cell culture medium (RPMI). In an effort to overcome this barrier, Lkt production in RPMI was compared with that in a semi-defined bacterial culture medium (SDM2, Methods) with two wild-type M. haemolytica strains. The total biological activity of Lkt secreted into SDM2 culture was up to 80-fold greater at its peak than that in RPMI for a given reference strain of M. haemolytica (e.g., Strain 183, Figure 3A). In both media, Lkt activity was induced in late log phase as batch cultures made the transition to stationary phase. Although the biological activity quickly diminished as the culture progressed to stationary phase, the total Lkt protein measured by SDS-PAGE continued to increase and the level was stable for hours in the culture supernatant (Figure 3B). The Lkt of clarified SDM2 culture supernatants with the highest cytolytic activity (Strain 183, fraction C, Figure 3B) was estimated to be 90% pure based on gel densitometry imaging of Coomassie-stained SDS-PAGE gels. Preparations of similar quality were used for in vitro assays. The cytolytic activity of these toxin preparations was stable at -80°C for more than 2 years.
CD18 signal peptide variant C 5 has enhanced binding to bacterial Lkt
Analysis of cattle ITGB2 haplotypes showed that three distinct polypeptide sequences are encoded in the 22-amino acid signal peptide region: the common variant with arginine at positions 3 and 5 (R 3 R 5), a rare variant with histidine at position 3 (H 3 R 5), and a second rare variant with cysteine at position 5 (R 3 C 5). Synthetic 22-mer peptides representing these three full-length signal peptides were tested for their ability to bind Lkt. The synthetic peptides were pre-incubated with M. haemolytica Lkt preparations to allow binding and then applied to Lkt-sensitive BL3 cell cultures in vitro (Figure 4A). The common R 3 R 5 signal peptide was used as a reference since it has a frequency of approximately 0.98 in U.S. cattle and is predicted to be present on 10 of the full-length CD18 sequences. The synthetic R 3 R 5 signal peptide had an IC50 of 17.9 µM, which represents the concentration of peptide needed to block 50% of Lkt-induced cytolysis of BL3 cells. The rare H 3 R 5 signal peptide variant, which is found on only one full-length CD18 variant (Figure 1C, variant 7), was similar to the reference, with an IC50 of 21.6 µM. In contrast, the rare R 3 C 5 signal peptide found on full-length CD18 variants 9, 11, 13, and 15 had an IC50 of 5.9 µM, which was 3-fold lower than the reference, indicating an increased affinity for Lkt (Figure 4A).
The optimum blocking of Lkt-induced cytotoxicity has been previously reported to occur with the 13-mer peptide corresponding to CD18 signal peptide residues 5 to 17 10 . Thus, we tested the effect of cysteine at position 5 (C 5) in this shorter peptide. The synthetic 13-mer C 5 peptide was 3.6-fold more effective at blocking Lkt toxicity than the 22-mer R 3 C 5 peptide (IC50 of 1.6 and 5.9 µM, respectively). When compared to the 13-mer reference peptide with arginine at position 5 (R 5), the 13-mer C 5 peptide was 8-fold better at blocking Lkt-induced cytotoxicity (IC50 of 13.0 and 1.6 µM for R 5 and C 5, respectively; Figure 4B). A negative control 13-mer peptide containing randomly assorted amino acids from the reference peptide sequence failed to inhibit Lkt-induced cytolysis, even at the highest concentration tested (50 µM), indicating that inhibition of cytolysis was sequence specific (Extended Data File S2) 48 . Together, these results suggest that peptide sequence and length affect Lkt binding.
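The fold differences quoted above follow directly from ratios of the reported IC50 values. The short Python sketch below recomputes them from the rounded IC50 values given in the text; the dictionary keys and the helper name `fold_lower` are illustrative and not taken from the study. Because the inputs are rounded, a ratio can differ slightly from the published figure (for example, 5.9/1.6 gives about 3.7 rather than 3.6).

```python
# IC50 values (µM) for BL3 cell assays, as reported in the text (rounded).
ic50_um = {
    "22-mer R3R5 (reference)": 17.9,
    "22-mer H3R5": 21.6,
    "22-mer R3C5": 5.9,
    "13-mer R5 (reference)": 13.0,
    "13-mer C5": 1.6,
}


def fold_lower(reference_ic50: float, test_ic50: float) -> float:
    """Return how many times lower the test IC50 is than the reference IC50."""
    return reference_ic50 / test_ic50


# Pairs are (reference peptide, test peptide), matching the comparisons in the text.
comparisons = [
    ("22-mer R3R5 (reference)", "22-mer R3C5"),
    ("22-mer R3C5", "13-mer C5"),
    ("13-mer R5 (reference)", "13-mer C5"),
]

fold_changes = {
    (ref, test): fold_lower(ic50_um[ref], ic50_um[test])
    for ref, test in comparisons
}

for (ref, test), fold in fold_changes.items():
    print(f"{test} vs {ref}: {fold:.1f}-fold lower IC50")
```

A lower IC50 means that less peptide is needed to block half of the Lkt-induced cytolysis, so a larger ratio indicates a more potent blocking peptide.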
Synthetic C 5 variant CD18 signal peptide blocks Lkt-induced cytolysis of primary bovine cells
The 13-mer C 5 synthetic signal peptide containing CD18 residues 5 to 17 was also tested for its ability to inhibit Lkt-induced cytolysis of primary cells isolated from cattle. The purpose was to demonstrate that the reduction in cytotoxicity observed with C 5 signal peptides was comparable between the immortalized cell line and freshly isolated leukocytes from cattle. Like the BL3 cell line, the donor animals were homozygous for CD18 variant "1" and thus have the reference (R 3 R 5) signal peptide. With primary PBMC (Figure 5A) or PMN (Figure 5B), the 13-mer C 5 synthetic signal peptide was significantly better at blocking Lkt-induced cytotoxicity than the R 5 signal peptide (7- and 14-fold, respectively), and the improvement was similar to that for BL3 cells (8-fold).
Arginine at position 4 (R 4) in the eland CD18 signal peptide disrupts the enhanced Lkt binding conferred by C 5
Some bovid species have CD18 signal peptide sequences that are slightly different from those in cattle (Figure 6A and Table S3). In water buffalo, the 13-mer peptide sequence corresponding to CD18 signal peptide amino acids 5 to 17 differs at position 16 (S 16), while the same region in sheep differs at positions 10 and 12 (F 10 S 12). However, synthetic peptides corresponding to the water buffalo and sheep sequences were similar to the R 5 reference cattle signal peptide when tested for their ability to bind Lkt and inhibit cytolysis of BL3 cells in vitro. In contrast, the eland signal peptide differed from cattle at three positions in this region (C 5 V 8 G 16), and its 13-mer synthetic signal peptide had a 24-fold decrease in IC50 compared to the reference cattle R 5 signal peptide and a similar IC50 compared to the cattle C 5 signal peptide (Figure 6A). However, when the synthetic peptide was extended to include the eland residue at position 4, the results were different. Eland have an arginine residue at position 4 (R 4) where other ruminants have a glutamine (Q 4). A synthetic 14-mer eland signal peptide with R 4 (i.e., Eland R 4 C 5 V 8 G 16) was 66-fold less able to bind Lkt than the 13-mer (C 5 V 8 G 16), as measured by IC50 (Figure 6B). When R 4 was replaced with the Q 4 normally found in cattle and other ruminants (i.e., Eland Q 4 C 5 V 8 G 16 versus R 4 C 5 V 8 G 16), Lkt binding was restored. Similarly, when Q 4 in the cattle signal peptide (Q 4 C 5) was replaced with R 4 (R 4 C 5), there was a significant reduction in the ability of this peptide to bind Lkt (5.6-fold increase in IC50). These results suggest that the amino acid at position 4 in the signal peptide can affect Lkt binding and that the R 4 residue in eland may disrupt the enhanced Lkt binding conferred by the C 5 variant residue.
A truncated CD18 C 5 signal peptide has high affinity for Lkt
Toxin inhibitors represent a potentially potent class of therapeutics that could protect animals during acute lung infection. Since truncated synthetic C 5 signal peptides showed increased affinity for Lkt, various peptide lengths were tested to identify those with maximum Lkt binding. Removing the first four N-terminal amino acids (MLRQ) resulted in no significant difference in Lkt binding for the C 5 or reference R 5 signal peptides (Extended Data Figure S2) 42 . In contrast, stepwise deletions of C-terminal residues of the C 5 signal peptide had a significant impact on Lkt binding, with the highest affinity observed for a 12-mer C 5 signal peptide comprising amino acids 5 to 16. The IC50 of this 12-mer C 5 peptide (CPQLLLLAGLLA) was 23-fold lower than that of the 13-mer reference R 5 signal peptide (residues 5 to 17) and 30-fold lower than that of the 12-mer R 5 signal peptide. Thus, the 12-mer C 5 peptide (CPQLLLLAGLLA) represents the minimal naturally-occurring peptide sequence with maximal inhibition of Lkt-induced BL3 cell lysis (Figure 7).
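The stepwise C-terminal deletion design can be illustrated by enumerating the candidate peptides. The following Python sketch starts from the 12-mer C 5 sequence given in the text, because the residue at position 17 is not listed in this section. The function name `c_terminal_truncations` and the lower length limit of eight residues are assumptions made for illustration only; the text does not state how far the deletions were carried.

```python
def c_terminal_truncations(peptide: str, min_length: int = 1) -> list[str]:
    """Return the peptide followed by each successive single-residue C-terminal deletion."""
    if not 1 <= min_length <= len(peptide):
        raise ValueError("min_length must be between 1 and the peptide length")
    # Slice from the full length down to min_length, dropping one residue each step.
    return [peptide[:n] for n in range(len(peptide), min_length - 1, -1)]


# 12-mer C5 peptide (CD18 signal peptide residues 5 to 16), as given in the text.
C5_12MER = "CPQLLLLAGLLA"

# Hypothetical lower bound of 8 residues, chosen only to keep the example short.
series = c_terminal_truncations(C5_12MER, min_length=8)

for peptide in series:
    print(f"{len(peptide):2d}-mer  {peptide}")
```

Each peptide in such a series would then be tested in the same cytotoxicity-blocking assay so that IC50 values can be compared across lengths.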

Discussion
The present report describes bovine CD18 amino acid sequence differences encoded by ITGB2 in 46 breeds of beef and dairy cattle. All of the protein coding variants discovered were missense mutations and their haplotypes were predicted to encode 15 distinct polypeptide sequences. A C 5 variant in the CD18 signal peptide region was shown to cause increased binding to M. haemolytica Lkt, a secreted toxin that causes cell lysis and acute inflammation leading to lung injury characteristic of bovine respiratory disease. The C 5 signal peptide variant increased the affinity for Lkt, and this effect was influenced by variation at adjacent residues. The increased Lkt binding and protection from cytotoxic effects were observed in the immortalized BL3 cell line, and freshly isolated PBMC and PMN from beef cattle.
The identification of naturally-occurring CD18 variants with increased binding to Lkt has important potential implications for animal health. For example, cattle with the CD18 C 5 signal peptide variant may be at increased risk for toxin-related respiratory disease. However, despite the fact that the C 5 variant was found in four predicted protein variants in taurine and indicine cattle, its overall frequency in the U.S. cattle population is still very low (0.01). Thus, identifying homozygous cattle whose leukocytes could be tested ex vivo for altered Lkt sensitivity will be challenging. Determining whether this altered binding phenotype contributes to differences in lung lesion severity or disease outcome following M. haemolytica infection will also be difficult. Together, these factors suggest that identifying and removing animals with the CD18 C 5 signal peptide would be premature and unwarranted for most cattle operations at this time.
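A rough calculation shows why homozygous animals are expected to be scarce at an allele frequency of 0.01. The Python sketch below converts the allele frequency into expected genotype frequencies under Hardy-Weinberg proportions. That model is an assumption introduced here for illustration and is not stated in the text, and the function name `hardy_weinberg` is likewise illustrative.

```python
def hardy_weinberg(q: float) -> dict[str, float]:
    """Expected genotype frequencies for a biallelic site with minor allele frequency q.

    Assumes Hardy-Weinberg proportions (random mating, no selection or stratification).
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    p = 1.0 - q
    return {"hom_ref": p * p, "het": 2.0 * p * q, "hom_alt": q * q}


# C5 allele frequency in U.S. cattle, as reported in the text.
freqs = hardy_weinberg(0.01)

# Expected number of animals screened per C5 homozygote found.
animals_per_homozygote = 1.0 / freqs["hom_alt"]

print(f"C5 heterozygotes: {freqs['het']:.4f}")
print(f"C5 homozygotes:   {freqs['hom_alt']:.4f}")
print(f"About 1 homozygote per {animals_per_homozygote:,.0f} animals")
```

Under this assumption, roughly 2% of animals would carry one copy of the C 5 allele, while only about one animal in 10,000 would be homozygous.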
Recent examples of gene editing in animals have shown that this can be a successful strategy for creating novel host genetic resistance. Groundbreaking work with porcine reproductive and respiratory syndrome virus (PRRSV) has shown that gene editing of a critical entry factor (CD163) confers complete resistance to infection in pigs 49,50 . Similarly, genetic resistance to M. haemolytica Lkt has been demonstrated in leukocytes isolated from a homozygous, gene-edited, bovine fetus expressing a cleavable CD18 signal peptide 15 . However, the uncleaved CD18 signal peptide is universally conserved in ruminants and thus, its removal may have deleterious effects on the animal. CD18 forms heterodimers with distinct, but structurally homologous alpha integrin subunits (e.g., CD11a/CD18, CD11b/CD18, and CD11c/CD18), and thus a cleaved signal peptide may have unknown, but far-reaching effects on biological functions. To date there are no reports of a healthy, live calf expressing a cleavable CD18 signal peptide. Thus, modifying the amino acid sequence of an uncleaved ruminant CD18 signal peptide may be useful as an alternative strategy to reduce Lkt binding, while preserving its normal evolutionarily conserved cellular function.

Synthetic 14-mer signal peptides representing CD18 amino acids 4 to 17 from eland and cattle variant C 5 were tested for their ability to bind Lkt. In addition, eland CD18 signal peptides were synthesized where the amino acid at position 4 in eland (arginine, R) was replaced with the amino acid normally found in cattle at this position (glutamine, Q; Eland Q 4 C 5 V 8 G 16). Similarly, cattle variant C 5 peptides were synthesized where the amino acid at position 4 in cattle was replaced with the amino acid naturally encoded in eland (Cattle R 4 C 5). The half-maximal inhibitory concentration (IC50) for each peptide was determined using non-linear regression analyses. Data are expressed as the mean with standard deviation (n=3 or 4).

Figure 7. C-terminal truncations reveal the minimal peptide sequence for inhibition of leukotoxin (Lkt)-induced cytolysis.
Synthetic CD18 signal peptides were synthesized with single amino acid C-terminal deletions. These peptides were tested for their ability to inhibit Lkt-induced cytolysis of bovine BL3 cells. Peptides were tested at concentrations ranging from 50 µM to 0.195 µM. The half-maximal inhibitory concentration (IC50) for each peptide was determined using non-linear regression analyses. Data are expressed as the mean with standard deviation (n=3).
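The Figure 7 legend gives a peptide concentration range of 50 µM to 0.195 µM and states that IC50 values were obtained by non-linear regression, without naming the model. The Python sketch below is one minimal way such an estimate could be made and is not the authors' procedure. It assumes a twofold serial dilution (nine points from 50 µM reach 0.195 µM), a two-parameter Hill-type inhibition curve, and a coarse least-squares grid search. The response values are synthetic and generated from the model itself; they are not data from the study.

```python
def twofold_series(top_um: float, n_points: int) -> list[float]:
    """Concentrations (µM) of a twofold serial dilution starting at top_um."""
    return [top_um / 2**i for i in range(n_points)]


def percent_cytolysis(conc_um: float, ic50_um: float, hill: float) -> float:
    """Assumed inhibition curve: 100% lysis without peptide, 0% at saturating peptide."""
    return 100.0 / (1.0 + (conc_um / ic50_um) ** hill)


def fit_ic50(concs: list[float], responses: list[float]) -> tuple[float, float]:
    """Least-squares grid search over IC50 (log-spaced, 0.1 to 100 µM) and Hill slope."""
    best = None
    for i in range(401):
        ic50 = 10 ** (-1 + 3 * i / 400)
        for j in range(5, 41):  # Hill slopes from 0.5 to 4.0 in steps of 0.1
            hill = j / 10
            sse = sum(
                (percent_cytolysis(c, ic50, hill) - r) ** 2
                for c, r in zip(concs, responses)
            )
            if best is None or sse < best[0]:
                best = (sse, ic50, hill)
    return best[1], best[2]


concs = twofold_series(50.0, 9)  # 50 µM down to about 0.195 µM

# Synthetic, noise-free responses for a hypothetical peptide with IC50 = 1.6 µM.
synthetic = [percent_cytolysis(c, 1.6, 1.0) for c in concs]

ic50_hat, hill_hat = fit_ic50(concs, synthetic)
print(f"lowest concentration: {concs[-1]:.3f} µM")
print(f"estimated IC50: {ic50_hat:.2f} µM (Hill slope {hill_hat:.1f})")
```

In practice a dedicated curve-fitting routine with replicate measurements would replace the grid search, which is used here only to keep the example free of external dependencies.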
Identifying the naturally-occurring CD18 signal peptide residues that influence Lkt binding provides a guide for more extensive analyses that may inform gene edits. Previously, synthetic signal peptides containing the R 5 residue were used to identify a 13-amino acid minimum CD18 binding site for Lkt (spanning residues 5 through 17) 10 . Using synthetic peptides representing cattle Q 4 C 5 and eland R 4 C 5, we showed that the residue at position 4 can drastically affect Lkt binding and that the net charge of the signal peptide is important. The positively charged side chain of R 4 apparently disrupts the enhanced binding of C 5 signal peptides compared to the neutral side chain of Q 4. In addition, the most common CD18 22-mer signal peptide (R 3 R 5) and the rare H 3 R 5 signal peptide have a net charge of +2 and reduced Lkt binding compared to the R 3 C 5 peptide (net charge of +1). It would be of interest to determine whether increasing the signal peptide net charge to +3 by introducing another positively charged residue could further reduce Lkt binding. However, there are numerous combinatorial peptide possibilities that could lead to a significantly reduced affinity of CD18 for Lkt, while only a few may preserve the basic biological functions of the signal peptide. Although large-scale screening of candidate peptides in vitro and testing them in vivo is beyond the scope of this study, our results suggest this as a possible future avenue of research.
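The net charges quoted above can be reproduced with a simple residue count. The Python sketch below scores only the first 16 residues of each signal peptide, assembled from the N-terminal MLRQ and the 12-mer sequence given in the Results, because residues 17 to 22 are not listed in this section; it therefore assumes those six residues carry no net charge. Counting histidine as +1 is a simplification adopted here because it matches the +2 stated for the H 3 R 5 peptide. The variable and function names are illustrative.

```python
# Simplified side-chain charges: Arg, Lys, and His count as +1; Asp and Glu as -1.
# Counting His as fully positive is a simplifying assumption.
POSITIVE = set("RKH")
NEGATIVE = set("DE")


def net_charge(peptide: str) -> int:
    """Net side-chain charge from a simple count of charged residues."""
    return sum((aa in POSITIVE) - (aa in NEGATIVE) for aa in peptide)


# Residues 1 to 16 of each cattle signal peptide variant (residues 17 to 22 omitted).
n_terminal_16 = {
    "R3R5": "MLRQRPQLLLLAGLLA",
    "H3R5": "MLHQRPQLLLLAGLLA",
    "R3C5": "MLRQCPQLLLLAGLLA",
}

charges = {name: net_charge(seq) for name, seq in n_terminal_16.items()}

for name, charge in charges.items():
    print(f"{name}: net charge {charge:+d}")
```

With these assumptions the count gives +2 for R 3 R 5 and H 3 R 5 and +1 for R 3 C 5, in agreement with the values stated in the text.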
An alternative strategy for reducing the impact of Lkt would be to block its activity in vivo with synthetic CD18 "decoy" peptides such as the 12-mer C 5 signal peptide identified here. A similar strategy has been used to neutralize anthrax toxin from Bacillus anthracis (reviewed in 51). A synthetic 12-mer peptide antitoxin, attached to liposome scaffolds in multiple copies, protected host cells from cytotoxicity in vitro and protected rats from becoming moribund in vivo 52 . Although the feasibility of delivering decoy peptides is unknown in cattle, this approach could be used to neutralize Lkt and thus protect calves from the major virulence factor associated with lung pathology in bovine respiratory disease complex. One can further imagine combining decoy peptide technology with gene editing to have alveolar leukocytes secrete decoy peptide inhibitors of Lkt at the sites of M. haemolytica infection. Although the theoretical possibilities of using antitoxin and gene editing strategies may be vast, the feasibility of these technologies is still relatively unknown. A better understanding of the underlying molecular mechanisms involved, together with significant improvements in livestock gene editing and decoy peptide technologies, is needed to move the field forward.

Conclusion
There are more than a dozen missense variants in the CD18 polypeptide, including a C 5 variant in the signal peptide that affects Lkt binding. Results in vitro suggest that animals carrying the C 5 allele may be more susceptible to the effects of Lkt. These results also identify a potentially potent class of non-antibiotic Lkt inhibitors that could protect cattle from cytotoxic effects during acute lung infections.

Data availability
Underlying data
Whole genome sequence files (FASTQ) for the BL3 cell line are available in the NCBI SRA under accession number SRX4645762.
The BL3 sequence data have also been deposited with links to BioProject accession number PRJNA325058 (BioSample SAMN05217649) in the NCBI BioProject database.
In addition, access to the aligned sequences is available via the USDA internet site: https://www.ars.usda.gov/plains-area/claycenter-ne/marc/wgs/celllines/ as described in the Methods.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

resistant to Mannheimia haemolytica in an in vitro assay. The current authors correctly note that there are no reports of live-born calves carrying this edit in homozygous form, and thus the viability of this gene edit for resistance to Mannheimia haemolytica remains unproven. However, none of the natural ITGB2 variants characterised by the authors delivers resistance. As the authors' search for variation was thorough and should have identified almost all variants of useful frequency, designing variant signal peptide sequences that are not cleaved but do not bind the leukotoxin may be the way forward. The authors' discussion of these points could be more explicit.

Is the work clearly and accurately presented and does it cite the current literature? Yes
Is the study design appropriate and is the work technically sound? Yes

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes
Are the conclusions drawn adequately supported by the results? Yes
Competing Interests: I have undertaken collaborative research with Dr Smith on topics unrelated to this manuscript and we published an article together in 2009 and conference abstracts more recently.
Reviewer Expertise: molecular genetics, gene editing
I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.