# gpytorch.utils¶

## Utilities¶

gpytorch.utils.cached(method=None, name=None, ignore_args=False)[source]

A decorator that allows naming a method's cache so that the cached value can be accessed or modified elsewhere.
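
Example

A hedged sketch of the decorator's intended use; the class and cache name below are hypothetical:

>>> class Helper(object):
>>>     @gpytorch.utils.cached(name="expensive_cache")
>>>     def expensive_computation(self):
>>>         ...  # runs once per instance; later calls are served from the "expensive_cache" entry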

gpytorch.utils.contour_integral_quad(lazy_tensor, rhs, inverse=False, weights=None, shifts=None, max_lanczos_iter=20, num_contour_quadrature=None, shift_offset=0)[source]

Performs $$\mathbf K^{1/2} \mathbf b$$ or $$\mathbf K^{-1/2} \mathbf b$$ using contour integral quadrature.

Parameters:

- lazy_tensor (gpytorch.lazy.LazyTensor) – LazyTensor representing $$\mathbf K$$
- rhs (torch.Tensor) – Right-hand side tensor $$\mathbf b$$
- inverse (bool) – (default False) whether to compute $$\mathbf K^{1/2} \mathbf b$$ (if False) or $$\mathbf K^{-1/2} \mathbf b$$ (if True)
- max_lanczos_iter (int) – (default 20) Number of Lanczos iterations to run (to estimate eigenvalues)
- num_contour_quadrature (int) – How many quadrature samples to use for approximation. Default is in settings.

Returns: torch.Tensor – Approximation to $$\mathbf K^{1/2} \mathbf b$$ or $$\mathbf K^{-1/2} \mathbf b$$.
gpytorch.utils.linear_cg(matmul_closure, rhs, n_tridiag=0, tolerance=None, eps=1e-10, stop_updating_after=1e-10, max_iter=None, max_tridiag_iter=None, initial_guess=None, preconditioner=None)[source]

Implements the linear conjugate gradients method for (approximately) solving systems of the form

lhs @ result = rhs

for positive definite and symmetric matrices.

Parameters:

- matmul_closure – a function which performs a left matrix multiplication with lhs_mat
- rhs – the right-hand side of the equation
- n_tridiag – returns a tridiagonalization of the first n_tridiag columns of rhs
- tolerance – stop the solve when the maximum residual is less than this
- eps – noise to add to prevent division by zero
- stop_updating_after – will stop updating a vector after this residual norm is reached
- max_iter – the maximum number of CG iterations
- max_tridiag_iter – the maximum size of the tridiagonalization matrix
- initial_guess – an initial guess at the solution result
- preconditioner – a function which left-preconditions a supplied vector

Returns:

- result – a solution to the system (if n_tridiag is 0)
- result, tridiags – a solution to the system, and corresponding tridiagonal matrices (if n_tridiag > 0)
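
Example

A minimal sketch of a CG solve against an explicit symmetric positive definite matrix; the matrix, sizes, and tolerances below are illustrative, not part of the API:

>>> n = 100
>>> A = torch.randn(n, n)
>>> A = A @ A.t() + n * torch.eye(n)               # symmetric positive definite
>>> b = torch.randn(n, 1)
>>> x = gpytorch.utils.linear_cg(A.matmul, b, max_iter=50)
>>> torch.norm(A @ x - b)                           # residual should be small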
class gpytorch.utils.StochasticLQ(max_iter=15, num_random_probes=10)[source]

Implements an approximate log determinant calculation for symmetric positive definite matrices using stochastic Lanczos quadrature. For efficient calculation of derivatives, we additionally compute the trace of the inverse using the same probe vectors that the log determinant was computed with. For more details, see Dong et al. 2017 (in submission).

evaluate(matrix_shape, eigenvalues, eigenvectors, funcs)[source]

Computes $$\mathrm{tr}(f(\mathbf A))$$ for an arbitrary list of functions, where $$f(\mathbf A)$$ is equivalent to applying the function elementwise to the eigenvalues of $$\mathbf A$$, i.e., if $$\mathbf A = \mathbf V \boldsymbol\Lambda \mathbf V^\top$$, then $$f(\mathbf A) = \mathbf V f(\boldsymbol\Lambda) \mathbf V^\top$$, where $$f(\boldsymbol\Lambda)$$ is applied elementwise. Note that calling this function with a list of functions to apply is significantly more efficient than calling it multiple times with one function: each additional function after the first requires negligible additional computation.

Parameters:

- matrix_shape – the shape of $$\mathbf A$$
- eigenvalues – eigenvalues of the tridiagonal Lanczos matrices (e.g., as returned by lanczos_tridiag_to_diag)
- eigenvectors – the corresponding eigenvectors
- funcs – a list of closures. Each function should expect to take a torch vector of eigenvalues as input and apply the function elementwise. For example, to compute $$\log\det(\mathbf A) = \mathrm{tr}(\log(\mathbf A))$$, [lambda x: x.log()] would be a reasonable value of funcs.

Returns: results (list of scalars) – The trace of each supplied function applied to the matrix, i.e., $$[\mathrm{tr}(f_1(\mathbf A)), \mathrm{tr}(f_2(\mathbf A)), \ldots, \mathrm{tr}(f_k(\mathbf A))]$$.
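
Example

A hedged sketch of the full pipeline, composed from the Lanczos utilities documented below; the matrix, iteration count, and number of probe vectors are illustrative:

>>> A = torch.randn(50, 50)
>>> A = A @ A.t() + 50 * torch.eye(50)              # symmetric positive definite
>>> q_mat, t_mat = gpytorch.utils.lanczos.lanczos_tridiag(
>>>     A.matmul, max_iter=15, dtype=A.dtype, device=A.device,
>>>     matrix_shape=A.shape, num_init_vecs=10,
>>> )
>>> eigenvalues, eigenvectors = gpytorch.utils.lanczos.lanczos_tridiag_to_diag(t_mat)
>>> slq = gpytorch.utils.StochasticLQ()
>>> (logdet_est,) = slq.evaluate(A.shape, eigenvalues, eigenvectors, [lambda x: x.log()])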
gpytorch.utils.minres(matmul_closure, rhs, eps=1e-25, shifts=None, value=None, max_iter=None, preconditioner=None)[source]

Performs MINRES to find solutions to $$(\mathbf K + \alpha \sigma \mathbf I) \mathbf x = \mathbf b$$. Solutions are found for multiple shifts $$\sigma$$ simultaneously.

Parameters:

- matmul_closure (callable) – Function to perform matmul with.
- rhs (torch.Tensor) – The vector $$\mathbf b$$ to solve against.
- shifts (torch.Tensor) – (default None) The shift $$\sigma$$ values. If set to None, then $$\sigma = 0$$.
- value (float) – (default None) The multiplicative constant $$\alpha$$. If set to None, then $$\alpha = 0$$.
- max_iter (int) – (default None) The maximum number of MINRES iterations. If set to None, then uses the constant stored in gpytorch.settings.max_cg_iterations.

Returns: torch.Tensor – The solves $$\mathbf x$$. The shape will correspond to the sizes of rhs and shifts.
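
Example

A minimal sketch of an unshifted and a shifted solve; the matrix, shift values, and shapes are illustrative:

>>> A = torch.randn(30, 30)
>>> A = A @ A.t() + 30 * torch.eye(30)              # symmetric positive definite
>>> b = torch.randn(30, 1)
>>> x = gpytorch.utils.minres(A.matmul, b)           # solves A x = b
>>> shifts = torch.tensor([0.1, 1.0])
>>> xs = gpytorch.utils.minres(A.matmul, b, shifts=shifts, value=1.0)  # one solve per shift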
gpytorch.utils.prod(items)[source]
gpytorch.utils.stable_pinverse(A: torch.Tensor) → torch.Tensor[source]

Compute a pseudoinverse of a matrix. Employs a stabilized QR decomposition.

gpytorch.utils.stable_qr(mat)[source]

Performs a QR decomposition on the batched matrix mat. A custom implementation is needed because of

1. slow batched QR in pytorch (pytorch/pytorch#22573)
2. possible singularity in R
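
Example

A minimal sketch of both stabilized routines; the (Q, R) return convention for stable_qr is an assumption here, and the shapes are illustrative:

>>> A = torch.randn(3, 6, 4)                         # batch of tall matrices
>>> Q, R = gpytorch.utils.stable_qr(A)               # assumed return: Q (3 x 6 x 4), R (3 x 4 x 4)
>>> pinv = gpytorch.utils.stable_pinverse(A)         # 3 x 4 x 6
>>> torch.allclose(A @ pinv @ A, A, atol=1e-4)       # pseudoinverse property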

### Lanczos Utilities¶

gpytorch.utils.lanczos.lanczos_tridiag(matmul_closure, max_iter, dtype, device, matrix_shape, batch_shape=torch.Size([]), init_vecs=None, num_init_vecs=1, tol=1e-05)[source]
gpytorch.utils.lanczos.lanczos_tridiag_to_diag(t_mat)[source]

Given a num_init_vecs x num_batch x k x k tridiagonal matrix t_mat, returns a num_init_vecs x num_batch x k set of eigenvalues and a num_init_vecs x num_batch x k x k set of eigenvectors.

TODO: perform the eigenvalue computations in batch mode.

### Permutation Utilities¶

gpytorch.utils.permutation.apply_permutation(matrix: Union[LazyTensor, torch.Tensor], left_permutation: Optional[torch.Tensor] = None, right_permutation: Optional[torch.Tensor] = None)[source]

Applies a left and/or right (partial) permutation to a given matrix $$\mathbf K$$:

$$\boldsymbol{\Pi}_\text{left} \mathbf K \boldsymbol{\Pi}_\text{right}^\top$$

where the permutation matrices $$\boldsymbol{\Pi}_\text{left}$$ and $$\boldsymbol{\Pi}_\text{right}^\top$$ are represented by vectors left_permutation and right_permutation.

The permutation matrices may be partial permutations (only selecting a subset of rows/columns) or full permutations (permuting all rows/columns).

Importantly, if $$\mathbf K$$ is a batch of matrices, left_permutation and right_permutation can be a batch of permutation vectors, and this function will apply the appropriate permutation to each batch entry. Broadcasting rules apply.

Parameters:

- matrix (LazyTensor or Tensor (... x n x n)) – $$\mathbf K$$
- left_permutation (Tensor, optional (... x <= n)) – vector representing $$\boldsymbol{\Pi}_\text{left}$$
- right_permutation (Tensor, optional (... x <= n)) – vector representing $$\boldsymbol{\Pi}_\text{right}$$

Returns: Tensor – $$\boldsymbol{\Pi}_\text{left} \mathbf K \boldsymbol{\Pi}_\text{right}^\top$$

Example

>>> factor = torch.randn(2, 3, 5, 5)
>>> matrix = factor @ factor.transpose(-1, -2)  # 2 x 3 x 5 x 5
>>> left_permutation = torch.tensor([
>>>     [ 1, 3, 2, 4, 0 ],
>>>     [ 2, 1, 0, 3, 4 ],
>>>     [ 0, 1, 2, 4, 3 ],
>>> ])  # Full permutation: 2 x 3 x 5
>>> right_permutation = torch.tensor([
>>>     [ 1, 3, 2 ],
>>>     [ 2, 1, 0 ],
>>>     [ 0, 1, 2 ],
>>> ])  # Partial permutation: 2 x 3 x 3
>>> apply_permutation(matrix, left_permutation, right_permutation)  # 2 x 3 x 5 x 3

gpytorch.utils.permutation.inverse_permutation(permutation)[source]

Given a (batch of) permutation vector(s), return a permutation vector that inverts the original permutation.

Example

>>> permutation = torch.tensor([ 1, 3, 2, 4, 0 ])
>>> inverse_permutation(permutation)  # torch.tensor([ 4, 0, 2, 1, 3 ])


class gpytorch.utils.quadrature.GaussHermiteQuadrature1D(num_locs=None)[source]

Implements Gauss-Hermite quadrature for integrating a function with respect to several 1D Gaussian distributions in batch mode. Within GPyTorch, this is useful primarily for computing expected log likelihoods for variational inference.

This is implemented as a Module because Gauss-Hermite quadrature has a set of locations and weights that it should initialize one time, but that should obey parent calls to .cuda(), .double() etc.

forward(func, gaussian_dists)[source]

Runs Gauss-Hermite quadrature on the callable func, integrating against the Gaussian distributions specified by gaussian_dists.

Parameters:

- func (callable) – Function to integrate
- gaussian_dists – Either a MultivariateNormal whose covariance is assumed to be diagonal or a torch.distributions.Normal.

Returns: Result of integrating func against each univariate Gaussian in gaussian_dists.
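
Example

A minimal sketch integrating a simple function against a batch of standard Normals (the function and batch size are illustrative); for a standard Normal, $$\mathbb E[x^2] = 1$$:

>>> quadrature = gpytorch.utils.quadrature.GaussHermiteQuadrature1D()
>>> dists = torch.distributions.Normal(torch.zeros(5), torch.ones(5))
>>> expected_sq = quadrature(lambda x: x ** 2, dists)   # approximately a vector of ones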

### Sparse Utilities¶

gpytorch.utils.sparse.bdsmm(sparse, dense)[source]

Batch dense-sparse matrix multiply
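
Example

A minimal sketch using the sparse identity from sparse_eye below; shapes are illustrative:

>>> sparse = gpytorch.utils.sparse.sparse_eye(5)
>>> dense = torch.randn(5, 3)
>>> res = gpytorch.utils.sparse.bdsmm(sparse, dense)    # equals dense, since sparse is the identity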

gpytorch.utils.sparse.make_sparse_from_indices_and_values(interp_indices, interp_values, num_rows)[source]

This produces a sparse tensor with a fixed number of non-zero entries in each column.

Parameters:

- interp_indices (Tensor) – A matrix containing the indices of the nonzero entries for each column
- interp_values (Tensor) – The corresponding values
- num_rows – the number of rows in the result matrix

Returns: SparseTensor – (batch_size) x num_cols x num_rows
gpytorch.utils.sparse.sparse_eye(size)[source]

Returns the identity matrix as a sparse matrix

gpytorch.utils.sparse.sparse_getitem(sparse, idxs)[source]
gpytorch.utils.sparse.sparse_repeat(sparse, *repeat_sizes)[source]
gpytorch.utils.sparse.to_sparse(dense)[source]

### Grid Utilities¶

class gpytorch.utils.grid.ScaleToBounds(lower_bound, upper_bound)[source]

Scale the input data so that it lies in between the lower and upper bounds.

In training (self.train()), this module adjusts the scaling factor to the minibatch of data. During evaluation (self.eval()), this module uses the scaling factor from the previous minibatch of data.

Parameters: lower_bound (float) – lower bound of scaled data upper_bound (float) – upper bound of scaled data

Example

>>> train_x = torch.randn(10, 5)
>>> module = gpytorch.utils.grid.ScaleToBounds(lower_bound=-1., upper_bound=1.)
>>>
>>> module.train()
>>> scaled_train_x = module(train_x)  # Data should be between -0.95 and 0.95
>>>
>>> module.eval()
>>> test_x = torch.randn(10, 5)
>>> scaled_test_x = module(test_x)  # Scaling is based on train_x

gpytorch.utils.grid.choose_grid_size(train_inputs, ratio=1.0, kronecker_structure=True)[source]

Given some training inputs, determine a good grid size for KISS-GP.

Parameters:

- train_inputs (torch.Tensor (... x n x d)) – the input data
- ratio (float, optional) – Amount of grid points per data point (default: 1.)
- kronecker_structure (bool, optional) – Whether or not the model will use Kronecker structure in the grid (set to True unless there is an additive or product decomposition in the prior)

Returns: int – Grid size
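
Example

A minimal sketch; the data shape and ratio are illustrative:

>>> train_x = torch.randn(100, 2)
>>> grid_size = gpytorch.utils.grid.choose_grid_size(train_x, ratio=1.0)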
gpytorch.utils.grid.create_data_from_grid(grid: List[torch.Tensor]) → torch.Tensor[source]
Parameters:

- grid (List[torch.Tensor]) – Each Tensor is a 1D set of increments for the grid in that dimension

Returns: torch.Tensor – The set of points on the grid going by column-major order
gpytorch.utils.grid.create_grid(grid_sizes: List[int], grid_bounds: List[Tuple[float, float]], extend: bool = True, device='cpu', dtype=torch.float32) → List[torch.Tensor][source]

Creates a grid, represented as a list of 1D Tensors giving the projection of the grid onto each dimension.

If extend is True, the grid is extended by two points past the specified boundary, which can be important for getting good grid interpolations.

Parameters:

- grid_sizes (List[int]) – Sizes of each grid dimension
- grid_bounds (List[Tuple[float, float]]) – Lower and upper bounds of each grid dimension
- device (torch.device, optional) – target device for output (default: cpu)
- dtype (torch.dtype, optional) – target dtype for output (default: torch.float)

Returns: List[torch.Tensor] – Grid points for each dimension. Grid points are stored in a torch.Tensor with shape grid_sizes[i].
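
Example

A minimal sketch combining create_grid with create_data_from_grid above; the sizes and bounds are illustrative:

>>> grid = gpytorch.utils.grid.create_grid([4, 5], [(-1., 1.), (0., 2.)])
>>> points = gpytorch.utils.grid.create_data_from_grid(grid)   # 20 x 2 grid points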
gpytorch.utils.grid.scale_to_bounds(x, lower_bound, upper_bound)[source]

DEPRECATED: Use ScaleToBounds instead.

Parameters:

- x (torch.Tensor (... x n x d)) – the input data
- lower_bound (float) – lower bound of scaled data
- upper_bound (float) – upper bound of scaled data

Returns: torch.Tensor (... x n x d) – scaled data

### Nearest Neighbors Utilities¶

class gpytorch.utils.nearest_neighbors.NNUtil(k, dim, batch_shape=torch.Size([]), preferred_nnlib='faiss', device='cpu')[source]

Utility for nearest neighbor search. It first tries to use faiss (which requires a separate package installation) as the backend for better computational performance; otherwise, scikit-learn is used, as it is installed with GPyTorch.

Parameters:

- k (int) – number of nearest neighbors
- dim (int) – dimensionality of the data
- batch_shape (torch.Size) – batch shape for the training data
- preferred_nnlib (str) – currently supports faiss and scikit-learn (default: faiss)
- device (torch.device) – device on which the NN search will be performed

Example

>>> train_x = torch.randn(10, 5)
>>> nn_util = NNUtil(k=3, dim=train_x.size(-1), device=train_x.device)
>>> nn_util.set_nn_idx(train_x)
>>> test_x = torch.randn(2, 5)
>>> test_nn_indices = nn_util.find_nn_idx(test_x) # finding 3 nearest neighbors for test_x
>>> test_nn_indices = nn_util.find_nn_idx(test_x, k=2) # finding 2 nearest neighbors for test_x
>>> sequential_nn_idx = nn_util.build_sequential_nn_idx(train_x) # build up sequential nearest neighbor
>>>     # structure for train_x

build_sequential_nn_idx(x)[source]

Build the sequential $$k$$ nearest neighbor structure within the training data in the following way: for the $$i$$-th data point $$x_i$$, find its $$k$$ nearest neighbors among the preceding training data $$x_1, \cdots, x_{i-1}$$, for $$i = k+1, \ldots, N$$, where $$N$$ is the size of the training data.

Parameters:

- x – training data. Shape: (N, D)

Returns: torch.LongTensor – indices of nearest neighbors. Shape: (N-k, k)
find_nn_idx(test_x, k=None)[source]

Find the $$k$$ nearest neighbors for the test data test_x among the training data stored in this utility.

Parameters:

- test_x – test data. Shape: (... x N x D)
- k (int) – number of nearest neighbors. Default is the value used at utility initialization.

Returns: torch.LongTensor – the indices of the nearest neighbors in the training data
set_nn_idx(train_x)[source]

Set the indices of training data to facilitate nearest neighbor search. This function needs to be called every time that the data changes.

Parameters:

- train_x (torch.Tensor) – training data points (... x N x D)
to(device)[source]

Move the utility to a CPU or GPU device.

Parameters:

- device (torch.device) – Target device.