Spatial Transcriptomics Dimensionality Reduction using Wavelet Bases

Spatially resolved transcriptomics (ST) measures gene expression along with the spatial coordinates of the measurements. The analysis of ST data involves significant computation complexity. In this work, we propose gene expression dimensionality reduction algorithm that retains spatial structure. We combine the wavelet transformation with matrix factorization to select spatially-varying genes. We extract a low-dimensional representation of these genes. We consider Empirical Bayes setting, imposing regularization through the prior distribution of factor genes. Additionally, We provide visualization of extracted representation genes capturing the global spatial pattern. We illustrate the performance of our methods by spatial structure recovery and gene expression reconstruction in simulation. In real data experiments, our method identifies spatial structure of gene factors and outperforms regular decomposition regarding reconstruction error. We found the connection between the fluctuation of gene patterns and wavelet technique, providing smoother visualization. We develop the package and share the workflow generating reproducible quantitative results and gene visualization. The package is available at https://github.com/OliverXUZY/waveST.


Introduction
Spatial resolved transcriptomics (ST) is a technology measuring spatial variation in gene expression.Different technological platforms support different genome coverage and spatial resolution (tissue-level measurement to subcellular measurement).Examples of ST platforms include Spatial Transcriptomics [Ståhl et al., 2016;Xia et al., 2019], 10x Genomics Visium, Slide-seq [Rodriques et al., 2019], sci-Space [Srivatsan et al., 2021], and MERFISH [Chen et al., 2015].Spatial transcriptomics allows visualization and quantitative analysis of the transcriptome with spatial resolution in individual tissue sections.A series of studies combining gene expression and spatial information has been brought to generate new insight in biological analysis [Kuppe et al., 2020;Shah et al., 2016;Berglund et al., 2018].
Quantification of gene expression has wide applications in transcriptomics.Understanding the spatial distribution of gene expression has helped to answer fundamental questions in developmental biology [Asp et al., 2019;Rödelsperger et al., 2021], cancer [Thrane et al., 2018;Moncada et al., 2020], and neuroscience [Moffitt et al., 2016;Close et al., 2021].Two widely-used methods for gene expression quantification are fluorescent in situ hybridization (FISH) and next-generation sequencing.
While these approaches have been made to measure gene expression while preserving spatial information, there are statistical challenges in analyses combining the gene and spatial information.Specifically, gene dimensionality reduction guided by spatial information is still an active area of study.In this work, we conduct a dimensionality reduction on spatially varying gene data.After transformation, gene expression over locations is a square matrix, which allows us to view it as an image.Therefore, tools from image processing can be adapted -our method incorporates spatial information by wavelet transformation, a multi-scale analysis decompose sequence into orthonormal series.We apply wavelet transformation for each gene expression over locations.This is a common technique in denoising the image.We thus apply techniques analogous to image analysis.We evaluate performance using reconstruction error and comparing resulting visualizations.Our analysis pipeline build from off-the-shelf tools, and code is available for reproducing our results.We also attach visualizations in our pipeline, giving example interpretations of intermediate result.
A natural idea in spatial transcriptomic analysis is to identify gene types across spatial locations [Abu-Jamous and Kelly, 2018].For example, in clustering the representations in each cluster achieve dimensionality reduction under the assumption that genes within one cluster have the same type.A parallel technique is clustering based on cell type instead.General clustering methods can be combined into more sophisticated pipelines tailored toward spatial single-cell analysis.SC3 [Kiselev et al., 2017] is an ensemble clustering method.It calculates distance matrices across cell locations using the Euclidean distance, then applies spectral clustering, and assigns membership [Kiselev et al., 2018].To identify the cell type in each cluster, one can perform differential expression analysis between all pairs of clusters.scGeneFit uses a label-aware compression method to find marker genes [Dumitrascu et al., 2021].Given the cell-by-gene expression matrix and cell clustering membership, scGeneFit maps cells to lower-dimensional space where cells within the same cluster are closer.For gene dimensionality reduction, [Zhu and Sabatti, 2020] constructs a neighborhood graph from the spatial coordinates, then applies a graph-based feature selection procedure to determine spatially varying genes.They also provide the option to infer a latent graph embedding for cells based on selected genes, applying spline models to fit the gene's expression on the latent embedding.Then they leverage the fitted coefficients to reduce the dimensionality of each gene.[Svensson et al., 2018] proposed pipelines using mixed-effect models incorporating spatial information.The model contains two random effect terms: a spatial variance term that parametrizes gene expression covariance by pairwise distance between samples and a noise term that models nonspatial variability.The model leverages efficient inference methods previously developed for linear mixed models, and it is computationally efficient.
Our setting is closely aligned with the following recent works: [Shang and Zhou, 2022] developed SpatialPCA, applying probabilistic PCA on ST data for dimensionality reduction.They assume data are given as a location-by-genes matrix and construct a regression model similar to factor analysis, where the prior covariance matrix of factor genes is a distance matrix constructed with a Gaussian kernel.[Velten et al., 2022] proposed MEFISTO, combining factor analysis with the non-parametric framework of Gaussian processes to model spatio-temporal dependencies in the latent space.[Townes and Engelhardt, 2021] developed nonnegative spatial factorization (NSF), combining a Gaussian process prior over spatial locations and a Poisson or negative binomial likelihood for count data, identifying generalizable spatial patterns of gene expression.All these works impose spatial structure on the prior of the factor genes.While these methods offer new dimensionality reduction techniques to cluster the genes, the complex model structure and a large number of hyperparameters introduce uncertainty and noise.Instead of imposing structural assumptions on the prior of the factor genes, we impose structure on the factor gene itself.Our contributions are the following: • We propose an approach, based on techniques from matrix decomposition and image signal processing to perform gene dimensionality reduction that retains inferred spatial structure.
• We run simulations showing that wavelet-guided dimensionality reduction performs better estimation than the singular value decomposition (SVD) under low signal-to-noise (SNR) regime.
• We perform real data experiments to identify the connection between wavelet techniques and fluctuation of the gene expression, which would be useful in selecting spatially related genes based on reconstruction error.
• We provide a gene extraction pipeline capturing the global information of spatially related genes.We provide smoother visualization of gene factors via wavelet methods.We develop an R package waveST and share the workflow generating reproducible quantitative results and gene visualization.
The diagram for workflow can be seen in Figure 1.The paper is organized in following: In Section 2 we introduce background on required techniques, including the wavelet transformation and matrix decompositions.In Section 3, we formally define our problem under this setting, and section 4 introduces our algorithms and analysis pipeline.In section 5, we implement simulations showing the effect of wavelet transformation in reducing error.In section 6 we conduct our method on data from [Weber, 2021], showing the reconstruction error in dimension reduction and visualization of lower-dimensional representations.

Background
In this section, we will cover background of methods we used in data pre-processing and analysis.
Figure 1: A summary of the proposed workflow.

Wavelet Transformation
Density estimation and function approximation is a fundamental problem in statistics and machine learning.Nonparametric methods such as spline regression [Perperoglou et al., 2019], Fourier transformation [Cochran et al., 1967] and Wavelet transformation [Nason, 2008] has been used in such scenarios.
Consider a model with one predictor: y = f (x) + , where E 2 = σ 2 .We want to estimate f , the trend of the function.
We assume the predictors are ordered x 1 < x 2 < . . .< x n .We have a signal or frequency to estimate.
Consider the Haar mother wavelet: Unlike the Fourier basis, wavelets oscillate and decay fast, only contributing to a certain local area and zero elsewhere.We can generate wavelets from the Haar mother wavelet ψ j,k (x) = 2 j/2 ψ 2 j x − k for integer j, k.These wavelets form orthonormal sets.We can decompose the trend as Wavelet based methods have several advantages among other non-parametric methods, especially dealing with sparse data.One property wavelet has is localization.If a sequence has a discontinuity, this will only influence the wavelet basis around it.In contrast, for a Fourier basis consisting of sine and cosine functions at different frequencies, every basis element will interact with this discontinuity, hence influencing every Fourier coefficient.
The simplest discrete wavelet transformation calculates the difference and sums between each adjacent pair.Suppose we have one vector with length n, where n is dyadic (n = 2 J ).We computed d J−1,k = y 2k − y 2k−1 as finest-level detail and c J−1,k = y 2k + y 2k−1 as the finest-level averages.We have {d k } containing n/2 = 2 J−1 as our finest level coefficients.To obtain the next coarsest coeffients we set: We continue this procedure until j reaches one.The set of details {d k } across levels are our wavelet coefficients.
We use J to denote the scale of wavelets, with larger J we have finer scale and approximation.However, finer scale sometimes introduces more parameters to capture minor details of the sequence (overfitting).The trade-off in choosing J is crucial in the wavelet transformation.
Regularization and smoothing is can be used to prevent overfitting.Regularization is usually conducted by shrinking wavelet coefficients.The concept of wavelet shrinkage was first proposed by [Donoho and Johnstone, 1994].The motivation behind wavelet shrinkage is straightforward.Consider empirical wavelet coefficients, the large coefficients usually contain true signal and noise, whereas the small coefficients only contain noise.The shrinkage is often used when the wavelet coefficients are assumed to be a sparse vector.
Wavelet shrinkage is often conducted by setting a threshold (denoted by δ) and only keeping the coefficients above the threshold.To choose a threshold, a natural metric is the squared error between estimated function and the truth: Donoho and Johnstone [1995] also proposed Sure threshold based on Stein's (1981) unbiased risk estimator.The optimal SURE threshold can be obtained in O(n log n) operations.Donoho and Johnstone [1995] also noted that SURE sometimes failed when the true signal coefficients are highly sparse, they proposed a hybrid scheme called SureShrink, combining the SURE and universal thresholds, using them depending on certain situations.
The extension of wavelet methods to 2D regularly spaced data (images) and such data in higher dimensions was proposed by [Mallat, 1989].We only consider 2-D wavelet transformation since we decompose 2-D spatial gene expression data.Suppose we have n × n matrix A where n = 2 J is dyadic.A simple discrete wavelet transformation on A first applies procedure (1) and ( 2) to the rows of the matrix.We then have two matrices of size n × n 2 , called H and G. Then we apply the same procedures to both the columns of H and G, resulting in four matrices HH, GH, HG, and GG each of size n 2 × n 2 .These are our finest level coefficients.HH is the local average of the original matrix used for the next level procedure.

Factor Gene
Clustering and dimensionality reduction are widely used in genomics.In a gene-by-sample matrix, genes are often grouped into profiles, where genes from the same profile have a similar function.Statistically, we can treat them as correlated variables.We use the term factor genes as the principal component in our gene-by-sample data.Factor genes compose the linear combinations of genes.We consider each gene as a variable, and we aim to find variables (f 1 , . . ., f K ) such that each gene g i has form where a j are coefficients.Suppose we have gene(P )-by-sample(N ) matrix A ∈ R N ×P with sample covariance matrix S = 1 n ÃT Ã, where Ã is the column centered A. Consider the SVD on Ã = U ΛV T .Then we have SV = V Λ 2 .The columns of V are called eigenarrays [Alter et al., 2000].The first K columns of Z = AV are the factor genes.The factor genes capture the mutual underlying information of genes.

Empirical Bayes Matrix Factorization
Matrix Factorization is often used in capture factor genes.We have a formulation similar to [Wang and Stephens, 2021], consider the factorization model on observed samples by gene data where F ∈ R N ×K denotes the factors, L ∈ R P ×K denotes the loadings for each factor, and E ∈ R N ×P denotes Gaussian noise with zero mean.Typical formulations would treat factors and loadings as fixed effects and use Maximum Likelihood Estimation (MLE).In our high-dimensional setting, penalty-based regularizers are often considered.Considering a prior for L and F under Bayesian setting has a similar effect and several advantages over adding a regularizer alone, such as simplifying hyperparameter search and selection of the number of factors K.
One feature of empirical Bayes approaches in matrix factorization proposed by [Bishop, 1999] is that the methods automatically select the number of factors K.In [Wang and Stephens, 2021], the author add factors with prior one at a time, estimating priors at each step until convergence.If the computed prior of the newly added factor is almost point mass on 0, the algorithm eliminates this factor and returns.
Following [Wang and Stephens, 2021], we consider Empirical Bayes in our setting, where we set a prior with unknown parameters of L and F .Empirical Bayes is not strictly Bayes, since the prior parameters are directly estimated by the MLE of the data.One example would be normal distribution D F of F and D L of L where coordinates are independent, which is conjugate prior.We estimate parameters in D F and D L by the maximize the marginal likelihood calculated by integral out L and F .Then we computed the posterior distribution of L and F .

Problem Setup
Consider Y ∈ R N ×P , where each row represents a sample and each column represents a gene.Further, let S ∈ R N ×2 store spatial coordinates of each sample.We assume there are spatially-related genes among all genes.We assume the genes fall into several profiles.The genes in the same profile have the same expression over the sampled spatial context.For each profile, we can summarize the gene expression for such group by one representative factor.Consider there are K profiles, we can then form our model as: where f k ∈ R n is the factor gene in profile k capturing the gene expression pattern in that group, l k ∈ R p indicates the loading coefficients of each factor.We assume there is random noise E ∈ R n×p in observations and E ij iid ∼ N (0, 1/τ ).The goal is to select spatially-related genes based on spatial information in S. We also extract the gene factors which capture information across all genes based on feature extraction.We aim to find a latent gene space that respects spatial structure.

Methods
In this section, we build up pipelines and models to achieve this goal.We adapt matrix decomposition methods to incorporate spatial information.We first preprocess gene expression to make them amenable for wavelet filtering.We then use the Daubechies D4 Wavelet Transform [Daubechies, 1992] as wavelet filter.

Gene Expression Over Location
We leverage a pre-processing step, which we call input generation, to combine spatial and expression data sources.We have both gene expression measurements and spatial coordinates for each sample.Each gene has an expression over each sample, and we can draw the sample over a 2D map by their coordinates.As shown in Figure 2, the gene with the least sparsity in expression shows varying levels of expression across different spatial regions.Intuitively, we may consider the spatial expression pattern from this gene to be spatially related.However, most genes are sparse and do not show fluctuations in gene expression.Therefore, we first filter out those genes using a kOverA filter: we only keep genes that have an expression measure above A in at least k samples [Gentleman et al., 2021].This has the effect of removing genes that are rarely active, though it is possible that strongly expressed genes still show no spatial relationships.
We want gene expression data to have a neat grid structure that can be expressed as a matrix form.However, at first, expression measurements are roughly staggered over the spatial locations, as shown in Figure 3.This is a consequence of the sampling strategy adopted by the 10x genomics Visium platform.To obtain an evenly resampled version of the expression pattern, we divide the two-dimensional space into several partitions and compute the local average of each partition.Detailed are shown in Algorithm 1.Each gene has an expression over the grid and we now can use a matrix Z ∈ R D 2 ×P to represent it.The matrix Z becomes the new input for analysis.We then obtain the same formulation as in Section 3, where we take n = D 2 and each location corresponds to a new observation.We have f k ∈ R D 2 , k = 1, . . ., K as the factor genes, computed by vectorizing the gene expression matrix.The choice of D depends on the wavelet scale used in the next section.We choose D to be dyadic to prevent handling edge cases in the wavelet transformation (recalling Section 2.1).The dyadic size is also a natural choice in image analysis.In particular, we choose D = 2 J , where J is the level of scale in the wavelet transformation.Larger J allows finer recovery, but also requires more parameters and may result in overfitting.

Wavelet Transformation and Shrinkage
We apply a wavelet transformation with shrinkage to denoise the gene expression matrix and smooth observed spatial expression patterns.In simulations, we find that this technique gives more accurate recovery as the signal-to-noise ratio decreases and yields less noisy visualizations for the spatially-related genes.
We implement wavelet transformation and shrinkage on the processed data matrix Z ∈ R D 2 ×P .Each column of Z is a vectorized expression matrix.For each column, we first reshape it into a D × D matrix and apply the wavelet transformation.We have a coefficient list with W coefficients associated with each gene.Then we conduct wavelet shrinkage on the coefficient list using the threshold strategies described in Section 2.1.Finally, we vectorize the coefficient list of each gene into a length W vector for each gene.Stacking all vectors together, we have a coefficient matrix C ∈ R W ×P .Note that the wavelet scale is specified by the size D of the input matrix, i.e., we have D = 2 J , where J is the scale.The details are specified in Algorithm 2.

Matrix Decomposition
The matrix of the shrunk wavelet coefficients C gives a summary of the denoised matrix.This allows improved reconstruction of gene expression data.To obtain a low-dimensional approximation, we apply matrix factorization on the coefficient matrix C after transformation.We use the same notation in Section 3, with the subscript c to denote the decomposition on coefficient matrix C.
The resulting singular vectors can be used to estimate spatially structured factor genes (Section 2.2).We use SVD as a frequentist approach to estimating Fc and Lc .We also conduct EBMF (Section 2.3), using the posterior expectation of F c and L c to estimate Fc and Lc .EBMF can select the number of factors K by itself, whereas K must be manually specified in the SVD.The choice of K is informed by inspecting the scree plot, as in spectral clustering and PCA.We choose the number of factors when current factors explain sufficient information and the diminishing returns of additional factors are no longer worth the additional cost.Given Fc and LC , we can compute the estimated coefficient matrix as Ĉ = Fc LT c .

Inverse Wavelet Transformation
To transfer the coefficient matrix Ĉ back to the estimated location-by-gene matrix Ẑ, we apply the inverse wavelet transformation on columns of Ĉ, each of which is a vectorized coefficient list for one gene.This results in a D 2 expression matrix for each gene.Then we vectorize the matrix and stack all vectors together.This yields the reconstructed matrix Ẑ.Details are given in Algorithm 3.For visualization, we also conduct a similar process for each column of Fc .In this case, each column is associated with an factor gene.By applying the inverse wavelet transformation to each factor gene, we can build spatial gene expression matrices M 1 , . . ., M K representing gene factors.

Algorithm 1 Input Generation
Require: Sample by gene matrix Y ∈ R N ×P , spatial matrix S ∈ R N ×2 , size of gene expression matrix D 1: Compute the range of x, y coordinates from S, compute the coordinates of vertices of big rectangle map B cover all N samples spatially.Vectorize Ĝi into a long vector ĝi with length D 2 6: end while 7: Stacking all { ĝi } P i=1 together into matrix Ẑ ∈ R D 2 ×P 8: return matrix Ẑ Post processing reconstructed gene expression matrix

Evaluation
We evaluate our proposal both quantitatively and qualitatively.Our quantitative results measures reconstruction error between the estimated Ẑ and Z using the Frobenius norm Ẑ − Z 2 F .This evaluation is also used for hyperparameter tuning, such as the scale J of the wavelet and the choice of wavelet thresholding method.Our qualitative result is given by visualization of the estimated factor gene expression matrices, with emphasis on capturing global spatial structure.
We use cross-validation in computing reconstruction error and calculating gene-wise errors.This evaluation is helpful in selecting spatially related genes.We also found a simple connection between the gradient of gene expression and spatial contribution, discussed in Section 6.This discussion requires a measure of spatial expression smoothness.To this end, a simple step computing the fluctuation of gene expression would give a spatial gene selection that coincides with the reconstruction error selection.For calculating successive differences, consider Z as the input matrix and let δ jk = (z j,k+1 − z jk ) 2 , k = 1, . . ., D 2 .Then we define the gradient by jk δ jk .

Simulation
We setup simulations to see whether representing spatial structure with a wavelet basis supports denoising and visualization of gene expression.We will see improved recovery in the low signal-to-noise ratio regime.Qualitatively, we also find factor gene visualizations to be more spatially consistent.
We first describe the simulation mechanism.We the set number of factors to K = 9.We generate K gene expression matrices , each representing a factor gene.These factor gene expression patterns are shown in Figure 4. We vectorize the patterns M k into factor genes f k , which are then scaled so f k = 1.We obtain F ∈ R D 2 ×K by stacking all the f k together.We generate loadings l 1 , . . ., l K ∈ R P ×K by drawing coordinates independently from N (0, 1).We set D = 32 and wavelet scale J = 5.We similarly stack l k to obtain L ∈ R P ×K .We use Z = F L T as the ground-truth signal matrix and add noise E to yield data matrix Z d = F L T + E, where entries in E are zero-mean normal noise that corrupt the underlying signal.In our analysis, we use a Daubechies D4 Wavelet as the wavelet filter, and for coefficient shrinkage we use hybrid thresholding.
We use the pipeline from Section 4. The generated data Z d already reflects the structure produced by the processing of Section 4.1.For comparison, we also directly decompose the matrix Z d with the SVD without any wavelet transformation.We call the resulting factors Fraw and reconstruction Ẑraw .We also have the same quantitative reconstruction error and qualitative visualization of factor gene Fraw .We also measure the size of the gradients across the estimated gene expression matrix Mi .The gradient is computed by successive difference of neighboring image pixels.We calculated the sum of the squares of the gradient.This measurement shows whether the gene expression matrix has been smoothed.This property is of interest, since smoother estimates are often more visually appealing.
We denote the signal-to-noise ratio as SNR = sd (Z)  sd(E) , where sd stands for standard deviation.Let r = 1 SNR .We specify 19 evenly spaced settings of r from 1 to 10.For each setting, we run 100 replicates with different simulated Z d and apply wavelet and SVD-based dimensionality reduction.The resulting average errors across r are shown in the Figure 5.The wavelet and shrinkage technique has better performance when r is larger than 5, i.e., the low signal-to-noise ratio regime.The gradient of the gene expression matrix under two methods is shown in Figure 5.The wavelet and shrinkage approach smooths edges in the factor gene expression image, giving a more interpretable visualization and lower error in this low SNR setting.
The factor genes are shown in Figure 6.The SVD without wavelet technique has erratic, outlying pixels, especially in the last three genes.The visualization is sensitive to outliers.In contrast, the SVD combined with wavelet has smoother patterns.Like the SVD-based method, it appears to have mixed several of the true underlying factors in each of the recovered ones.Moreover, the sharp boundaries visible in the SVD factors become smoothed over in the waveletdecomposition.The wavelet method applies a decomposition on coefficients space after thresholding, while SVD operates on individual pixels.The SVD capture more information, but also emphasize nuisance information induced by errors.Wavelet method also reduces model complexity, improving estimation accuracy.Other non-parametric methods, such as Fourier transformation and Gaussian Process-based methods also operate on coefficients space.Still, they would struggle to capture sharp transitions, since their bases are smooth functions.
Nonetheless, both the SVD and wavelet-based visualizations reflect spatial trends in the true factor genes (in Figure 4).The wavelet thresholding approach smooths over edges in the original visualization.The first factor gene captures the global spatial expression.We have a similar result for the second factor gene, as shown in Figure 12.The second factor gene is orthogonal to the first one, capturing different spatial structures in gene expression.Different factor genes capture global and local structure, and using a wavelet decomposition provides denoised spatial expression visualization.
In conclusion, we can use this dimensionality reduction technique for spatial gene selection and extraction.We select wavelet-suited genes based on the calculated gradient.Then, we can select spatially related genes based on reconstruction error via cross-validation.Alternatively, we can extract factor gene to capture spatial information and use factor genes for visualization and further analysis.

Code Vignette
We provide a small code block to show vignette of generating Figure 11 and 12, shown in code block 1. Listing 1: The basic wavelet-based dimensionality reduction workflow.kOverA_ST transforms the original spatial expression context into an image.This is processed by waveST.The final dimensionality reduction step is performed by decompose.
We have made a new package waveST containing the workflow we developed.The package is available at https: //github.com/OliverXUZY/waveST.The kOverA_ST function reduces data from an original [Weber, 2021] class input using the kOverA technique.Then we use waveST function to construct a S4 class waveST, containing input generated by Algorithm 1.This object-oriented approach stores properties of the original spatial experiment and simplifies downstream calls, like decomposition and visualization.We use the decompose function to apply all decomposition methods.In line 6, we apply the SVD to our original data, setting the number of factors to 5. In line 11, we apply the wavelet-based reduction and apply a manual threshold with τ = 40.We use plot and k=1 to visualize the first factor gene and matrix slides of the genes with the largest 5 loadings on that factor.

Conclusion
We have proposed a pipeline for dimensionality reduction that respects spatial structure.Both simulations and real data experiments demonstrate that wavelet and shrinkage techniques show positive results in spatially resolved transcriptomics data.We highlight the idea of combining image processing techniques and statistical methods for application in a spatial genomics context.One future direction is splitting genes into a groups suited or not for wavelet-based decomposition, and implementing decomposition with or without wavelet.Another direction is to focus input generation on only those genes that are thought to be spatially related.For genes not related to spatial information, we may perform regular decomposition on original data Y ∈ R N ×P , abandoning the spatial information of those genes.We expect this to improve reconstruction performance.In further analysis, it is worth considering other wavelet smoothing techniques and wavelet filters.The current methods incorporate little biological information.Bringing more domain knowledge will require further techniques, but is expected to yield better results.The input generation computes local average over even grids; however, it is possible to apply the wavelet method for irregularly spaced data [Nason, 2008].We hope wavelet methods will be useful in adapting existing methods for statistical genomics to the spatial setting.

Figure 2 :
Figure 2: The most dense gene has a structured expression pattern.

Figure 3 :
Figure 3: Left: The gene expression over scattered samples over locations.Right: The local average of gene expression over grid locations, used as input to our algorithms.

Figure 4 :
Figure 4: The gene expression pattern for nine factor genes in the simulation experiment.

Figure 5 :
Figure 5: The reconstruction error and gradient for estimated gene-by-location matrix with different SNR.The x-axis shows the rate of 1 SNR The left subplot shows the gradient changes.The gradient of gene expression is always lower with the wavelet technique, a consequence of its smoothing property.The right subplot shows the error -the wavelet method has lower error as the magnitude of noise increase.
Figure 6: Factor gene visualization.Figure (a) implements SVD on Z d , the visualization of column of Fraw .Figure (b) implements SVD on coefficient matrix, the visualization of F .

Figure 9 :
Figure9: The reconstruction error for each gene.The x-axis is the error from EBMF on raw data, the y-axis is the error from the combination of wavelet thresholding and EBMF.

Figure 10 :
Figure10: The sum of squares of the gradient for each gene.The y-axis is the index of each gene.Each bar's color shows whether that gene has better performance when using a wavelet basis.The M on the x-axis stands for "millions".
Figure 11: The top left figure is first factor gene, the following figure by row is genes with largest loadings on first factor gene: gene 712, 713, 716, 719, 715.
Figure 12: The top left figure is the second factor gene, the following figure by row is genes with largest loadings on second factor gene: gene 596, 289, 373, 712, 578.

Figure 14 :
Figure14: The reconstruction error for each gene.The x-axis is the error from SVD on raw data, the y-axis is the error from the combination of wavelet thresholding and SVD.
(a) Error per gene without wavelet transformation.(b) Error per gene with wavelet transformation.

Figure 15 :
Figure 15: The reconstruction error for each gene.
2: Partition interval x and interval y into D equal length interval, together get D 2 partitions over rectangle B. 3: while i ≤ P do Consider gene i expression over map B Wavelet Transformation and Shrinkage Require: Location by gene matrix Z ∈ R D 2 ×P , threshold method, optional threshold parameter τ .Apply threshold on wavelet coefficient if threshold method is specified 1: while i ≤ P do Consider gene i expression as matrix G i 2: Select i th g i column of gene matrix Z Stacking all {c i } P i=l together into matrix C ∈ R W ×P 9: return coefficient matrix C Column i of C store the coefficient of from gene i 7:Vectorize G i into a vector g i with length D 2 8: end while 9: Stacking all {g i } P i=l together into matrix Z ∈ R D 2 ×P 10: return matrix Z A transformed gene expression matrix Algorithm 2 2: Select i th c i column of gene matrix C 3: Form c i into coefficient list C i with number of coefficients W 4: Apply 2-D inverse wavelet transformation over C i , get post-wavelet expression matrix Ĝi 5: