GPax models
Gaussian Processes - Fully Bayesian Implementation
- class gpax.models.gp.ExactGP(input_dim, kernel, mean_fn=None, kernel_prior=None, mean_fn_prior=None, noise_prior=None, noise_prior_dist=None, lengthscale_prior_dist=None)[source]
Bases: object
Gaussian process class
- Parameters:
input_dim (int) – Number of input dimensions
kernel (Union[str,Callable[[Array,Array,Dict[str,Array],Array],Array]]) – Kernel function (‘RBF’, ‘Matern’, ‘Periodic’, or custom function)
mean_fn (Optional[Callable[[Array,Dict[str,Array]],Array]]) – Optional deterministic mean function (use ‘mean_fn_prior’ to make it probabilistic)
kernel_prior (Optional[Callable[[],Dict[str,Array]]]) – Optional custom priors over kernel hyperparameters. Use it when passing your custom kernel.
mean_fn_prior (Optional[Callable[[],Dict[str,Array]]]) – Optional priors over mean function parameters
noise_prior_dist (Optional[Distribution]) – Optional custom prior distribution over the observational noise variance. Defaults to LogNormal(0, 1).
lengthscale_prior_dist (Optional[Distribution]) – Optional custom prior distribution over kernel lengthscale. Defaults to LogNormal(0, 1).
Examples
Regular GP for sparse noisy observations
>>> import gpax
>>> # Get random number generator keys for training and prediction
>>> rng_key, rng_key_predict = gpax.utils.get_keys()
>>> # Initialize model
>>> gp_model = gpax.ExactGP(input_dim=1, kernel='Matern')
>>> # Run HMC to obtain posterior samples for the GP model parameters
>>> gp_model.fit(rng_key, X, y)  # X and y are arrays with dimensions (n, 1) and (n,)
>>> # Make a noiseless prediction on new inputs
>>> y_pred, y_samples = gp_model.predict(rng_key_predict, X_new, noiseless=True)
GP with custom noise prior
>>> import numpyro
>>> gp_model = gpax.ExactGP(
...     input_dim=1, kernel='RBF',
...     noise_prior_dist=numpyro.distributions.HalfNormal(.1)
... )
>>> # Run HMC to obtain posterior samples for the GP model parameters
>>> gp_model.fit(rng_key, X, y)  # X and y are arrays with dimensions (n, 1) and (n,)
>>> # Make a noiseless prediction on new inputs
>>> y_pred, y_samples = gp_model.predict(rng_key_predict, X_new, noiseless=True)
GP with custom probabilistic model as its mean function
>>> import numpyro
>>> # Define a deterministic mean function
>>> mean_fn = lambda x, param: param["a"]*x + param["b"]
>>>
>>> # Define priors over the mean function parameters (to make it probabilistic)
>>> def mean_fn_prior():
...     a = numpyro.sample("a", numpyro.distributions.Normal(3, 1))
...     b = numpyro.sample("b", numpyro.distributions.Normal(0, 1))
...     return {"a": a, "b": b}
>>>
>>> # Initialize structural GP model
>>> sgp_model = gpax.ExactGP(
...     input_dim=1, kernel='Matern',
...     mean_fn=mean_fn, mean_fn_prior=mean_fn_prior)
>>> # Run HMC to obtain posterior samples for the GP model parameters
>>> sgp_model.fit(rng_key, X, y)  # X and y are numpy arrays with dimensions (n, d) and (n,)
>>> # Make a noiseless prediction on new inputs
>>> y_pred, y_samples = sgp_model.predict(rng_key_predict, X_new, noiseless=True)
- model(X, y=None, **kwargs)[source]
GP probabilistic model with inputs X and targets y
- Return type:
None
- fit(rng_key, X, y, num_warmup=2000, num_samples=2000, num_chains=1, chain_method='sequential', progress_bar=True, print_summary=True, device=None, **kwargs)[source]
Run Hamiltonian Monte Carlo to infer the GP parameters
- Parameters:
rng_key (array) – random number generator key
X (Array) – 2D feature vector
y (Array) – 1D target vector
num_warmup (int) – number of HMC warmup states
num_samples (int) – number of HMC samples
num_chains (int) – number of HMC chains
chain_method (str) – ‘sequential’, ‘parallel’ or ‘vectorized’
progress_bar (bool) – show progress bar
print_summary (bool) – print summary at the end of sampling
device (Type[Device]) – optionally specify a cpu or gpu device on which to run the inference; e.g., device=jax.devices("cpu")[0]
**jitter – Small positive term added to the diagonal part of a covariance matrix for numerical stability (Default: 1e-6)
- Return type:
None
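The jitter keyword described above can be illustrated outside of gpax. The following is a minimal NumPy sketch, not gpax's implementation: the helper names `rbf_kernel` and `add_jitter` are hypothetical, and the data are made up so that the kernel matrix is rank-deficient.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0):
    """Squared-exponential kernel between two (n, d) arrays (illustrative helper)."""
    sq_dist = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale**2)

def add_jitter(K, jitter=1e-6):
    """Add a small positive term to the diagonal of a covariance matrix."""
    return K + jitter * np.eye(K.shape[0])

# A duplicated input (0.5 appears twice) makes two rows of K identical,
# so K itself is rank-deficient.
X = np.array([[0.0], [0.5], [0.5], [1.0]])
K = rbf_kernel(X, X)

# With jitter on the diagonal the matrix becomes positive definite,
# so the Cholesky factorization is well defined.
K_stable = add_jitter(K, jitter=1e-6)
L = np.linalg.cholesky(K_stable)
```

A larger jitter makes the factorization more robust at the cost of slightly perturbing the covariance.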
- get_samples(chain_dim=False)[source]
Get posterior samples (after running the MCMC chains)
- Return type:
Dict[str,Array]
- get_mvn_posterior(X_new, params, noiseless=False, **kwargs)[source]
Returns parameters (mean and cov) of multivariate normal posterior for a single sample of GP parameters
- Return type:
Tuple[Array,Array]
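For orientation, the mean and covariance of a GP posterior for one set of hyperparameters follow the standard conditioning formulas. The sketch below uses plain NumPy with an RBF kernel; the function `mvn_posterior` and the dictionary keys `k_length`, `k_scale`, and `noise` are assumptions made for illustration and are not taken from the gpax source.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, scale=1.0):
    """Squared-exponential kernel between two (n, d) arrays (illustrative helper)."""
    sq_dist = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return scale * np.exp(-0.5 * sq_dist / lengthscale**2)

def mvn_posterior(X_train, y_train, X_new, params, noiseless=False, jitter=1e-6):
    """Posterior mean and covariance at X_new for a single hyperparameter sample."""
    n = X_train.shape[0]
    # Training covariance with observation noise and jitter on the diagonal
    K = rbf_kernel(X_train, X_train, params["k_length"], params["k_scale"])
    K = K + (params["noise"] + jitter) * np.eye(n)
    # Cross-covariance and test covariance
    K_star = rbf_kernel(X_new, X_train, params["k_length"], params["k_scale"])
    K_new = rbf_kernel(X_new, X_new, params["k_length"], params["k_scale"])
    if not noiseless:
        # Include observation noise in the predictive covariance
        K_new = K_new + params["noise"] * np.eye(X_new.shape[0])
    mean = K_star @ np.linalg.solve(K, y_train)
    cov = K_new - K_star @ np.linalg.solve(K, K_star.T)
    return mean, cov

X = np.linspace(0, 1, 8)[:, None]
y = np.sin(2 * np.pi * X[:, 0])
params = {"k_length": 0.3, "k_scale": 1.0, "noise": 1e-4}  # hypothetical values
mean, cov = mvn_posterior(X, y, X, params, noiseless=True)
```

With a small noise value, the posterior mean evaluated at the training inputs stays close to the training targets.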
- predict_in_batches(rng_key, X_new, batch_size=100, samples=None, n=1, filter_nans=False, predict_fn=None, noiseless=False, device=None, **kwargs)[source]
Make prediction at X_new with sampled GP parameters by splitting the input array into chunks (“batches”) and running predict_fn (defaults to self.predict) on each of them one-by-one to avoid a memory overflow
- Return type:
Tuple[Array,Array]
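The chunking idea can be sketched independently of gpax. In the NumPy example below, `predict_in_batches` and `toy_predict` are hypothetical stand-ins (the real method also handles random keys, samples, and devices); the sketch only shows how the input is split and the per-batch outputs are concatenated.

```python
import numpy as np

def predict_in_batches(predict_fn, X_new, batch_size=100):
    """Apply predict_fn to consecutive slices of X_new and concatenate the results."""
    outputs = []
    for start in range(0, X_new.shape[0], batch_size):
        batch = X_new[start:start + batch_size]
        outputs.append(predict_fn(batch))
    return np.concatenate(outputs, axis=0)

def toy_predict(X):
    """Stand-in for a pointwise predictor; returns one value per input row."""
    return np.sin(X[:, 0])

X_new = np.linspace(0, 10, 250)[:, None]
# 250 points with batch_size=100 are processed as chunks of 100, 100, and 50
y_batched = predict_in_batches(toy_predict, X_new, batch_size=100)
```

Only one batch of inputs is processed at a time, which bounds the size of any intermediate covariance matrix to batch_size by batch_size.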
- predict(rng_key, X_new, samples=None, n=1, filter_nans=False, noiseless=False, device=None, **kwargs)[source]
Make prediction at X_new points using posterior samples for GP parameters
- Parameters:
rng_key (Array) – random number generator key
X_new (Array) – new inputs with (number of points, number of features) dimensions
samples (Optional[Dict[str,Array]]) – optional (different) samples with GP parameters
n (int) – number of samples from Multivariate Normal posterior for each HMC sample with GP parameters
filter_nans (bool) – filter out samples containing NaN values (if any)
noiseless (bool) – Noise-free prediction. It is set to False by default as new/unseen data is assumed to follow the same distribution as the training data. Hence, since we introduce a model noise by default for the training data, we also want to include that noise in our prediction.
device (Type[Device]) – optionally specify a cpu or gpu device on which to make a prediction; e.g., device=jax.devices("gpu")[0]
**jitter – Small positive term added to the diagonal part of a covariance matrix for numerical stability (Default: 1e-6)
- Return type:
Tuple[Array,Array]
- Returns
Center of mass of the sampled means and all the sampled predictions
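One way to read the two return values is sketched below in NumPy. This is an assumption-laden illustration, not the gpax code: `aggregate_predictions` is a hypothetical helper, and the output shape (number of HMC samples, n, number of points) is assumed rather than confirmed by this page.

```python
import numpy as np

def aggregate_predictions(rng, means, covs, n=1, filter_nans=False):
    """Draw n MVN samples per parameter sample and average the per-sample means.

    means: (num_hmc, num_points), covs: (num_hmc, num_points, num_points)
    """
    samples = np.stack(
        [rng.multivariate_normal(m, c, size=n) for m, c in zip(means, covs)]
    )  # shape (num_hmc, n, num_points)
    if filter_nans:
        # Drop every parameter sample whose draws contain a NaN
        keep = ~np.isnan(samples).any(axis=(1, 2))
        means, samples = means[keep], samples[keep]
    y_pred = means.mean(axis=0)  # center of mass of the sampled means
    return y_pred, samples

rng = np.random.default_rng(0)
num_hmc, num_points = 4, 3
means = rng.normal(size=(num_hmc, num_points))
covs = np.stack([0.01 * np.eye(num_points)] * num_hmc)
y_pred, y_samples = aggregate_predictions(rng, means, covs, n=2)
```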
- class gpax.models.uigp.UIGP(input_dim, kernel, mean_fn=None, kernel_prior=None, mean_fn_prior=None, noise_prior=None, noise_prior_dist=None, lengthscale_prior_dist=None, sigma_x_prior_dist=None)[source]
Bases: ExactGP
Gaussian process with uncertain inputs
This class extends the standard Gaussian Process model to handle uncertain inputs. It allows for incorporating the uncertainty in input data into the GP model, providing a more robust prediction.
- Parameters:
input_dim (int) – Number of input dimensions
kernel (Union[str,Callable[[Array,Array,Dict[str,Array],Array],Array]]) – Kernel function (‘RBF’, ‘Matern’, ‘Periodic’, or custom function)
mean_fn (Optional[Callable[[Array,Dict[str,Array]],Array]]) – Optional deterministic mean function (use ‘mean_fn_prior’ to make it probabilistic)
kernel_prior (Optional[Callable[[],Dict[str,Array]]]) – Optional custom priors over kernel hyperparameters. Use it when passing your custom kernel.
mean_fn_prior (Optional[Callable[[],Dict[str,Array]]]) – Optional priors over mean function parameters
noise_prior_dist (Optional[Distribution]) – Optional custom prior distribution over the observational noise variance. Defaults to LogNormal(0, 1).
lengthscale_prior_dist (Optional[Distribution]) – Optional custom prior distribution over kernel lengthscale. Defaults to LogNormal(0, 1).
sigma_x_prior_dist (Optional[Distribution]) – Optional custom prior for the input uncertainty (sigma_x). Defaults to HalfNormal(0.1) under the assumption that data is normalized to (0, 1).
Examples
UIGP with custom prior over sigma_x
>>> # Get random number generator keys for training and prediction
>>> rng_key, rng_key_predict = gpax.utils.get_keys()
>>> # Initialize model
>>> gp_model = gpax.UIGP(input_dim=1, kernel='Matern', sigma_x_prior_dist=gpax.utils.halfnormal_dist(0.5))
>>> # Run HMC to obtain posterior samples for the model parameters
>>> gp_model.fit(rng_key, X, y, num_warmup=2000, num_samples=10000)
>>> # Make a prediction on new inputs
>>> y_pred, y_samples = gp_model.predict(rng_key_predict, X_new)
- model(X, y=None, **kwargs)[source]
Gaussian process model for uncertain (stochastic) inputs
- Return type:
None
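The notion of uncertain inputs can be pictured as treating each observed input as a noisy version of a latent "true" input. The NumPy sketch below is only a conceptual illustration under that assumption; `sample_uncertain_inputs` is a hypothetical helper and does not reproduce how UIGP parameterizes the model internally.

```python
import numpy as np

def sample_uncertain_inputs(rng, X, sigma_x_scale=0.1):
    """Draw sigma_x from a HalfNormal prior and perturb the inputs with it."""
    # |N(0, scale)| is a HalfNormal(scale) draw; one sigma_x per input dimension
    sigma_x = np.abs(rng.normal(0.0, sigma_x_scale, size=X.shape[1]))
    # Latent inputs: observed inputs plus Gaussian noise of width sigma_x
    X_prime = X + sigma_x * rng.normal(size=X.shape)
    return X_prime, sigma_x

rng = np.random.default_rng(0)
X = np.linspace(0, 1, 10)[:, None]  # inputs normalized to (0, 1)
X_prime, sigma_x = sample_uncertain_inputs(rng, X, sigma_x_scale=0.1)
```

The default scale of 0.1 is small relative to the unit interval, which is why the default prior presumes normalized inputs.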
- get_mvn_posterior(X_new, params, noiseless=False, **kwargs)[source]
Returns parameters (mean and cov) of multivariate normal posterior for a single sample of UIGP parameters
- Return type:
Tuple[Array,Array]
- fit(rng_key, X, y, num_warmup=2000, num_samples=2000, num_chains=1, chain_method='sequential', progress_bar=True, print_summary=True, device=None, **kwargs)
Run Hamiltonian Monte Carlo to infer the GP parameters
- Parameters:
rng_key (array) – random number generator key
X (Array) – 2D feature vector
y (Array) – 1D target vector
num_warmup (int) – number of HMC warmup states
num_samples (int) – number of HMC samples
num_chains (int) – number of HMC chains
chain_method (str) – ‘sequential’, ‘parallel’ or ‘vectorized’
progress_bar (bool) – show progress bar
print_summary (bool) – print summary at the end of sampling
device (Type[Device]) – optionally specify a cpu or gpu device on which to run the inference; e.g., device=jax.devices("cpu")[0]
**jitter – Small positive term added to the diagonal part of a covariance matrix for numerical stability (Default: 1e-6)
- Return type:
None
- get_samples(chain_dim=False)
Get posterior samples (after running the MCMC chains)
- Return type:
Dict[str,Array]
- predict(rng_key, X_new, samples=None, n=1, filter_nans=False, noiseless=False, device=None, **kwargs)
Make prediction at X_new points using posterior samples for GP parameters
- Parameters:
rng_key (Array) – random number generator key
X_new (Array) – new inputs with (number of points, number of features) dimensions
samples (Optional[Dict[str,Array]]) – optional (different) samples with GP parameters
n (int) – number of samples from Multivariate Normal posterior for each HMC sample with GP parameters
filter_nans (bool) – filter out samples containing NaN values (if any)
noiseless (bool) – Noise-free prediction. It is set to False by default as new/unseen data is assumed to follow the same distribution as the training data. Hence, since we introduce a model noise by default for the training data, we also want to include that noise in our prediction.
device (Type[Device]) – optionally specify a cpu or gpu device on which to make a prediction; e.g., device=jax.devices("gpu")[0]
**jitter – Small positive term added to the diagonal part of a covariance matrix for numerical stability (Default: 1e-6)
- Return type:
Tuple[Array,Array]
- Returns
Center of mass of the sampled means and all the sampled predictions
- predict_in_batches(rng_key, X_new, batch_size=100, samples=None, n=1, filter_nans=False, predict_fn=None, noiseless=False, device=None, **kwargs)
Make prediction at X_new with sampled GP parameters by splitting the input array into chunks (“batches”) and running predict_fn (defaults to self.predict) on each of them one-by-one to avoid a memory overflow
- Return type:
Tuple[Array,Array]
- sample_from_prior(rng_key, X, num_samples=10)
Samples from prior predictive distribution at X
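Sampling from a prior predictive distribution means drawing hyperparameters from their priors and then drawing function values given those hyperparameters. The NumPy sketch below assumes LogNormal(0, 1) priors on the lengthscale, output scale, and noise, and an RBF kernel; the function and its internals are illustrative assumptions, not the gpax implementation.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, scale=1.0):
    """Squared-exponential kernel between two (n, d) arrays (illustrative helper)."""
    sq_dist = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return scale * np.exp(-0.5 * sq_dist / lengthscale**2)

def sample_from_prior(rng, X, num_samples=10, jitter=1e-6):
    """Draw num_samples vectors of observations at X from the prior predictive."""
    n = X.shape[0]
    draws = []
    for _ in range(num_samples):
        # Hyperparameters drawn from LogNormal(0, 1) priors (assumed here)
        lengthscale = rng.lognormal(0.0, 1.0)
        scale = rng.lognormal(0.0, 1.0)
        noise = rng.lognormal(0.0, 1.0)
        K = rbf_kernel(X, X, lengthscale, scale) + (noise + jitter) * np.eye(n)
        draws.append(rng.multivariate_normal(np.zeros(n), K))
    return np.stack(draws)

rng = np.random.default_rng(0)
X = np.linspace(0, 1, 15)[:, None]
prior_draws = sample_from_prior(rng, X, num_samples=10)
```

Plotting such draws before fitting is a quick check on whether the chosen priors produce functions on a plausible scale.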
- class gpax.models.hskgp.VarNoiseGP(input_dim, kernel, noise_kernel='RBF', mean_fn=None, kernel_prior=None, mean_fn_prior=None, noise_kernel_prior=None, lengthscale_prior_dist=None, noise_mean_fn=None, noise_mean_fn_prior=None, noise_lengthscale_prior_dist=None)[source]
Bases: ExactGP
Heteroskedastic Gaussian process class
- Parameters:
input_dim (int) – Number of input dimensions
kernel (Union[str,Callable[[Array,Array,Dict[str,Array],Array],Array]]) – Main kernel function (‘RBF’, ‘Matern’, ‘Periodic’, or custom function)
noise_kernel (Union[str,Callable[[Array,Array,Dict[str,Array],Array],Array]]) – Noise kernel function (‘RBF’, ‘Matern’, ‘Periodic’, or custom function)
mean_fn (Optional[Callable[[Array,Dict[str,Array]],Array]]) – Optional deterministic mean function (use ‘mean_fn_prior’ to make it probabilistic)
kernel_prior (Optional[Callable[[],Dict[str,Array]]]) – Optional custom priors over main kernel hyperparameters. Use it when passing your custom kernel.
mean_fn_prior (Optional[Callable[[],Dict[str,Array]]]) – Optional priors over mean function parameters
noise_kernel_prior (Optional[Callable[[],Dict[str,Array]]]) – Optional custom priors over noise kernel hyperparameters. Use it when passing your custom kernel.
lengthscale_prior_dist (Optional[Distribution]) – Optional custom prior distribution over main kernel lengthscale. Defaults to LogNormal(0, 1).
noise_mean_fn (Optional[Callable[[Array,Dict[str,Array]],Array]]) – Optional noise mean function
noise_mean_fn_prior (Optional[Callable[[],Dict[str,Array]]]) – Optional priors over noise mean function
noise_lengthscale_prior_dist (Optional[Distribution]) – Optional custom prior distribution over noise kernel lengthscale. Defaults to LogNormal(0, 1).
Examples:
Use two different kernels with default priors for main and noise processes
>>> # Get random number generator keys for training and prediction
>>> rng_key, rng_key_predict = gpax.utils.get_keys()
>>> # Initialize model
>>> gp_model = gpax.VarNoiseGP(input_dim=1, kernel='RBF', noise_kernel='Matern')
>>> # Run HMC to obtain posterior samples for the GP model parameters
>>> gp_model.fit(rng_key, X, y)
>>> # Make a prediction on new inputs
>>> y_pred, y_samples = gp_model.predict(rng_key_predict, X_new)
>>> # Get the inferred noise samples (for training data)
>>> data_variance = gp_model.get_data_var_samples()
Specify custom kernel lengthscale priors for main and noise kernels
>>> lscale_prior = gpax.utils.gamma_dist(5, 1)  # equivalent to numpyro.distributions.Gamma(5, 1)
>>> noise_lscale_prior = gpax.utils.halfnormal_dist(1)  # equivalent to numpyro.distributions.HalfNormal(1)
>>> # Initialize model
>>> gp_model = gpax.VarNoiseGP(
...     input_dim=1, kernel='RBF', noise_kernel='Matern',
...     lengthscale_prior_dist=lscale_prior, noise_lengthscale_prior_dist=noise_lscale_prior)
>>> # Run HMC to obtain posterior samples for the GP model parameters
>>> gp_model.fit(rng_key, X, y)
>>> # Make a prediction on new inputs
>>> y_pred, y_samples = gp_model.predict(rng_key_predict, X_new)
>>> # Get the inferred noise samples (for training data)
>>> data_variance = gp_model.get_data_var_samples()
- model(X, y=None, **kwargs)[source]
Heteroskedastic GP probabilistic model with inputs X and targets y
- Return type:
None
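A heteroskedastic model lets the noise variance change with the input instead of being a single scalar. The NumPy sketch below shows one common construction, in which a second ("noise") GP models the log-variance; the exponential link and all helper names are assumptions for illustration, and the actual VarNoiseGP model may be parameterized differently.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0):
    """Squared-exponential kernel between two (n, d) arrays (illustrative helper)."""
    sq_dist = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale**2)

rng = np.random.default_rng(0)
n = 20
X = np.linspace(0, 1, n)[:, None]

# Noise process: a latent GP draw interpreted as the log of the noise variance
K_noise = rbf_kernel(X, X, lengthscale=0.5) + 1e-6 * np.eye(n)
log_var = rng.multivariate_normal(np.zeros(n), K_noise)
noise_var = np.exp(log_var)  # one positive variance per data point

# Main process: the per-point variances replace the usual scalar noise term
K_main = rbf_kernel(X, X, lengthscale=0.2)
K_y = K_main + np.diag(noise_var)
y = rng.multivariate_normal(np.zeros(n), K_y)
```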
- get_mvn_posterior(X_new, params, *args, **kwargs)[source]
Returns parameters (mean and cov) of multivariate normal posterior for a single sample of heteroskedastic GP parameters
- Return type:
Tuple[Array,Array]
- fit(rng_key, X, y, num_warmup=2000, num_samples=2000, num_chains=1, chain_method='sequential', progress_bar=True, print_summary=True, device=None, **kwargs)
Run Hamiltonian Monte Carlo to infer the GP parameters
- Parameters:
rng_key (array) – random number generator key
X (Array) – 2D feature vector
y (Array) – 1D target vector
num_warmup (int) – number of HMC warmup states
num_samples (int) – number of HMC samples
num_chains (int) – number of HMC chains
chain_method (str) – ‘sequential’, ‘parallel’ or ‘vectorized’
progress_bar (bool) – show progress bar
print_summary (bool) – print summary at the end of sampling
device (Type[Device]) – optionally specify a cpu or gpu device on which to run the inference; e.g., device=jax.devices("cpu")[0]
**jitter – Small positive term added to the diagonal part of a covariance matrix for numerical stability (Default: 1e-6)
- Return type:
None
- get_samples(chain_dim=False)
Get posterior samples (after running the MCMC chains)
- Return type:
Dict[str,Array]
- predict(rng_key, X_new, samples=None, n=1, filter_nans=False, noiseless=False, device=None, **kwargs)
Make prediction at X_new points using posterior samples for GP parameters
- Parameters:
rng_key (Array) – random number generator key
X_new (Array) – new inputs with (number of points, number of features) dimensions
samples (Optional[Dict[str,Array]]) – optional (different) samples with GP parameters
n (int) – number of samples from Multivariate Normal posterior for each HMC sample with GP parameters
filter_nans (bool) – filter out samples containing NaN values (if any)
noiseless (bool) – Noise-free prediction. It is set to False by default as new/unseen data is assumed to follow the same distribution as the training data. Hence, since we introduce a model noise by default for the training data, we also want to include that noise in our prediction.
device (Type[Device]) – optionally specify a cpu or gpu device on which to make a prediction; e.g., device=jax.devices("gpu")[0]
**jitter – Small positive term added to the diagonal part of a covariance matrix for numerical stability (Default: 1e-6)
- Return type:
Tuple[Array,Array]
- Returns
Center of mass of the sampled means and all the sampled predictions
- predict_in_batches(rng_key, X_new, batch_size=100, samples=None, n=1, filter_nans=False, predict_fn=None, noiseless=False, device=None, **kwargs)
Make prediction at X_new with sampled GP parameters by splitting the input array into chunks (“batches”) and running predict_fn (defaults to self.predict) on each of them one-by-one to avoid a memory overflow
- Return type:
Tuple[Array,Array]
- sample_from_prior(rng_key, X, num_samples=10)
Samples from prior predictive distribution at X
- class gpax.models.mngp.MeasuredNoiseGP(input_dim, kernel, mean_fn=None, kernel_prior=None, mean_fn_prior=None, lengthscale_prior_dist=None)[source]
Bases: ExactGP
Gaussian process model that incorporates measured noise. This class extends the ExactGP model by allowing the inclusion of measured noise variances in the GP framework. Unlike standard GP models where noise is typically inferred, this model uses noise values obtained from repeated measurements at the same input points.
- Parameters:
input_dim (int) – Number of input dimensions
kernel (Union[str,Callable[[Array,Array,Dict[str,Array],Array],Array]]) – Kernel function (‘RBF’, ‘Matern’, ‘Periodic’, or custom function)
mean_fn (Optional[Callable[[Array,Dict[str,Array]],Array]]) – Optional deterministic mean function (use ‘mean_fn_prior’ to make it probabilistic)
kernel_prior (Optional[Callable[[],Dict[str,Array]]]) – Optional custom priors over kernel hyperparameters. Use it when passing your custom kernel.
mean_fn_prior (Optional[Callable[[],Dict[str,Array]]]) – Optional priors over mean function parameters
lengthscale_prior_dist (Optional[Distribution]) – Optional custom prior distribution over kernel lengthscale. Defaults to LogNormal(0, 1).
Examples
>>> # Get random number generator keys for training and prediction
>>> key1, key2 = gpax.utils.get_keys()
>>> # Initialize model
>>> gp_model = gpax.MeasuredNoiseGP(input_dim=1, kernel='Matern')
>>> # Run HMC to obtain posterior samples for the GP model parameters
>>> gp_model.fit(key1, X, y_mean, noise)  # X, y_mean, and noise have dimensions (n, 1), (n,), and (n,)
>>> # Make a prediction on new inputs by extrapolating noise variance with either linear regression or Gaussian process
>>> y_pred, y_samples = gp_model.predict(key2, X_new, noise_prediction_method='linreg')
- model(X, y=None, measured_noise=None, **kwargs)[source]
GP model that accepts measured noise
- Return type:
None
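The `y_mean` and `noise` arrays passed to this model could be prepared from repeated measurements roughly as follows. This NumPy sketch uses synthetic data and assumes that the per-point sample variance is the quantity to pass as measured noise; whether the variance of individual measurements or the variance of the mean is appropriate depends on the experiment.

```python
import numpy as np

rng = np.random.default_rng(0)
n_points, n_repeats = 5, 10
X = np.linspace(0, 1, n_points)[:, None]

# Synthetic repeated measurements: one row per repeat, one column per input point
true_signal = np.sin(2 * np.pi * X[:, 0])
repeats = true_signal[None, :] + 0.1 * rng.normal(size=(n_repeats, n_points))

# Per-point mean and (unbiased) sample variance across the repeats
y_mean = repeats.mean(axis=0)         # shape (n_points,)
noise = repeats.var(axis=0, ddof=1)   # shape (n_points,)
```

The resulting shapes, (n, 1) for X and (n,) for both y_mean and noise, match those stated in the example above.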
- fit(rng_key, X, y, measured_noise, num_warmup=2000, num_samples=2000, num_chains=1, chain_method='sequential', progress_bar=True, print_summary=True, device=None, **kwargs)[source]
Run Hamiltonian Monte Carlo to infer the GP parameters
- Parameters:
rng_key (array) – random number generator key
X (Array) – 2D feature vector
y (Array) – 1D target vector
measured_noise (Array) – 1D vector with measured noise
num_warmup (int) – number of HMC warmup states
num_samples (int) – number of HMC samples
num_chains (int) – number of HMC chains
chain_method (str) – ‘sequential’, ‘parallel’ or ‘vectorized’
progress_bar (bool) – show progress bar
print_summary (bool) – print summary at the end of sampling
device (Type[Device]) – optionally specify a cpu or gpu device on which to run the inference; e.g., device=jax.devices("cpu")[0]
**jitter – Small positive term added to the diagonal part of a covariance matrix for numerical stability (Default: 1e-6)
- Return type:
None
- predict(rng_key, X_new, samples=None, n=1, filter_nans=False, noiseless=True, device=None, noise_prediction_method='linreg', **kwargs)[source]
Make prediction at X_new points using posterior samples for GP parameters
- Parameters:
rng_key (Array) – random number generator key
X_new (Array) – new inputs with (number of points, number of features) dimensions
samples (Optional[Dict[str,Array]]) – optional (different) samples with GP parameters
n (int) – number of samples from Multivariate Normal posterior for each HMC sample with GP parameters
filter_nans (bool) – filter out samples containing NaN values (if any)
noiseless (bool) – Noise-free prediction. It is set to False by default as new/unseen data is assumed to follow the same distribution as the training data. Hence, since we introduce a model noise by default for the training data, we also want to include that noise in our prediction.
device (Type[Device]) – optionally specify a cpu or gpu device on which to make a prediction; e.g., device=jax.devices("gpu")[0]
noise_prediction_method (str) – Method for extrapolating noise variance to new/test data. Choose between ‘linreg’ and ‘gpreg’. Defaults to ‘linreg’.
**jitter – Small positive term added to the diagonal part of a covariance matrix for numerical stability (Default: 1e-6)
- Return type:
Tuple[Array,Array]
- Returns
Center of mass of the sampled means and all the sampled predictions
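Extrapolating measured noise to new inputs with linear regression can be sketched with ordinary least squares. The helper `extrapolate_noise_linreg` below is hypothetical and is not the routine gpax uses for the 'linreg' option; clipping negative predictions to zero is an additional assumption made so that the result remains a valid variance.

```python
import numpy as np

def extrapolate_noise_linreg(X, noise, X_new):
    """Fit noise ~ X @ w + b by least squares and evaluate the fit at X_new."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])  # append an intercept column
    coef, *_ = np.linalg.lstsq(A, noise, rcond=None)
    A_new = np.hstack([X_new, np.ones((X_new.shape[0], 1))])
    # Assumption: clip at zero, since a variance cannot be negative
    return np.clip(A_new @ coef, 0.0, None)

X = np.array([[0.0], [1.0], [2.0]])
noise = np.array([0.1, 0.2, 0.3])   # measured noise grows linearly with x
X_new = np.array([[3.0], [4.0]])
noise_new = extrapolate_noise_linreg(X, noise, X_new)
```

For this exactly linear toy data the fitted line continues the trend, giving approximately 0.4 and 0.5 at the new inputs.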
- get_mvn_posterior(X_new, params, noiseless=False, **kwargs)
Returns parameters (mean and cov) of multivariate normal posterior for a single sample of GP parameters
- Return type:
Tuple[Array,Array]
- get_samples(chain_dim=False)
Get posterior samples (after running the MCMC chains)
- Return type:
Dict[str,Array]
- predict_in_batches(rng_key, X_new, batch_size=100, samples=None, n=1, filter_nans=False, predict_fn=None, noiseless=False, device=None, **kwargs)
Make prediction at X_new with sampled GP parameters by splitting the input array into chunks (“batches”) and running predict_fn (defaults to self.predict) on each of them one-by-one to avoid a memory overflow
- Return type:
Tuple[Array,Array]
- sample_from_prior(rng_key, X, num_samples=10)
Samples from prior predictive distribution at X
- class gpax.models.vgp.vExactGP(input_dim, kernel, mean_fn=None, kernel_prior=None, mean_fn_prior=None, noise_prior=None, noise_prior_dist=None, lengthscale_prior_dist=None)[source]
Bases: ExactGP
Gaussian process class for vector-valued targets
- Parameters:
input_dim (int) – number of input dimensions
kernel (str) – type of kernel (‘RBF’, ‘Matern’, ‘Periodic’)
mean_fn (Optional[Callable[[Array,Dict[str,Array]],Array]]) – optional deterministic mean function (use ‘mean_fn_prior’ to make it probabilistic)
kernel_prior (Optional[Callable[[],Dict[str,Array]]]) – optional custom priors over kernel hyperparameters (uses LogNormal(0, 1) by default)
mean_fn_prior (Optional[Callable[[],Dict[str,Array]]]) – optional priors over mean function parameters
noise_prior (Optional[Callable[[],Dict[str,Array]]]) – optional custom prior for observation noise
noise_prior_dist (Optional[Distribution]) – Optional custom prior distribution over the observational noise variance. Defaults to LogNormal(0, 1).
lengthscale_prior_dist (Optional[Distribution]) – Optional custom prior distribution over kernel lengthscale. Defaults to LogNormal(0, 1).
- model(X, y=None, **kwargs)[source]
GP probabilistic model with inputs X and vector-valued targets y
- Return type:
None
- get_mvn_posterior(X_new, params, noiseless=False, **kwargs)[source]
Returns parameters (mean and cov) of multivariate normal posterior for a single sample of GP parameters. Wrapper over self._get_mvn_posterior.
- Return type:
Tuple[Array,Array]
- predict_in_batches(rng_key, X_new, batch_size=100, samples=None, n=1, filter_nans=False, predict_fn=None, noiseless=False, device=None, **kwargs)[source]
Make prediction at X_new with sampled GP parameters by splitting the input array into chunks (“batches”) and running predict_fn (defaults to self.predict) on each of them one-by-one to avoid a memory overflow
- Return type:
Tuple[Array,Array]
- fit(rng_key, X, y, num_warmup=2000, num_samples=2000, num_chains=1, chain_method='sequential', progress_bar=True, print_summary=True, device=None, **kwargs)
Run Hamiltonian Monte Carlo to infer the GP parameters
- Parameters:
rng_key (array) – random number generator key
X (Array) – 2D feature vector
y (Array) – 1D target vector
num_warmup (int) – number of HMC warmup states
num_samples (int) – number of HMC samples
num_chains (int) – number of HMC chains
chain_method (str) – ‘sequential’, ‘parallel’ or ‘vectorized’
progress_bar (bool) – show progress bar
print_summary (bool) – print summary at the end of sampling
device (Type[Device]) – optionally specify a cpu or gpu device on which to run the inference; e.g., device=jax.devices("cpu")[0]
**jitter – Small positive term added to the diagonal part of a covariance matrix for numerical stability (Default: 1e-6)
- Return type:
None
- get_samples(chain_dim=False)
Get posterior samples (after running the MCMC chains)
- Return type:
Dict[str,Array]
- predict(rng_key, X_new, samples=None, n=1, filter_nans=False, noiseless=False, device=None, **kwargs)
Make prediction at X_new points using posterior samples for GP parameters
- Parameters:
rng_key (Array) – random number generator key
X_new (Array) – new inputs with (number of points, number of features) dimensions
samples (Optional[Dict[str,Array]]) – optional (different) samples with GP parameters
n (int) – number of samples from Multivariate Normal posterior for each HMC sample with GP parameters
filter_nans (bool) – filter out samples containing NaN values (if any)
noiseless (bool) – Noise-free prediction. It is set to False by default as new/unseen data is assumed to follow the same distribution as the training data. Hence, since we introduce a model noise by default for the training data, we also want to include that noise in our prediction.
device (Type[Device]) – optionally specify a cpu or gpu device on which to make a prediction; e.g., device=jax.devices("gpu")[0]
**jitter – Small positive term added to the diagonal part of a covariance matrix for numerical stability (Default: 1e-6)
- Return type:
Tuple[Array,Array]
- Returns
Center of mass of the sampled means and all the sampled predictions
- sample_from_prior(rng_key, X, num_samples=10)
Samples from prior predictive distribution at X
Gaussian Processes - Approximate Bayesian
- class gpax.models.vigp.viGP(input_dim, kernel, mean_fn=None, kernel_prior=None, mean_fn_prior=None, noise_prior=None, noise_prior_dist=None, lengthscale_prior_dist=None, guide='delta')[source]
Bases: ExactGP
Variational inference based Gaussian process
- Parameters:
input_dim (int) – Number of input dimensions
kernel (str) – Kernel function (‘RBF’, ‘Matern’, ‘Periodic’, or custom function)
mean_fn (Optional[Callable[[Array,Dict[str,Array]],Array]]) – Optional deterministic mean function (use ‘mean_fn_prior’ to make it probabilistic)
kernel_prior (Optional[Callable[[],Dict[str,Array]]]) – Optional custom priors over kernel hyperparameters; uses LogNormal(0, 1) by default
mean_fn_prior (Optional[Callable[[],Dict[str,Array]]]) – Optional priors over mean function parameters
noise_prior_dist (Optional[Distribution]) – Optional custom prior distribution over the observational noise variance. Defaults to LogNormal(0, 1).
lengthscale_prior_dist (Optional[Distribution]) – Optional custom prior distribution over kernel lengthscale. Defaults to LogNormal(0, 1).
guide (str) – Auto-guide option, use ‘delta’ (default) or ‘normal’
Examples
Use viGP to reconstruct data from sparse noisy observations
>>> # Get random number generator keys
>>> rng_key, rng_key_predict = gpax.utils.get_keys()
>>> # Initialize model
>>> gp_model = gpax.viGP(input_dim=1, kernel='Matern')
>>> # Run variational inference to obtain a MAP estimate for the GP model parameters
>>> gp_model.fit(rng_key, X, y, num_steps=1000)  # X and y are arrays with dimensions (n, 1) and (n,)
>>> # Make a noiseless prediction on new inputs
>>> y_pred, y_samples = gp_model.predict(rng_key_predict, X_new, noiseless=True)
- fit(rng_key, X, y, num_steps=1000, step_size=0.005, progress_bar=True, print_summary=True, device=None, **kwargs)[source]
Run variational inference to learn GP (hyper)parameters
- Parameters:
rng_key (array) – random number generator key
X (Array) – 2D feature vector with (number of points, number of features) dimensions
y (Array) – 1D target vector with (n,) dimensions
num_steps (int) – number of SVI steps
step_size (float) – step size schedule for Adam optimizer
progress_bar (bool) – show progress bar
print_summary (bool) – print summary at the end of training
device (Type[Device]) – optionally specify a cpu or gpu device on which to run the inference; e.g., device=jax.devices("cpu")[0]
**jitter – Small positive term added to the diagonal part of a covariance matrix for numerical stability (Default: 1e-6)
- Return type:
None
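With a 'delta' guide, variational inference reduces to a point (MAP) estimate obtained by gradient-based optimization. The toy NumPy sketch below illustrates that idea with a hand-written Adam loop on a one-parameter model; it is not how gpax or NumPyro run SVI, and the model (Gaussian likelihood with unit variance, Normal(0, 1) prior on the mean) is an assumption chosen because its MAP estimate has a closed form.

```python
import numpy as np

def adam_minimize(grad_fn, x0, num_steps=1000, step_size=0.005,
                  b1=0.9, b2=0.999, eps=1e-8):
    """Minimize a scalar function with the Adam update rule, given its gradient."""
    x, m, v = float(x0), 0.0, 0.0
    for t in range(1, num_steps + 1):
        g = grad_fn(x)
        m = b1 * m + (1 - b1) * g          # first-moment estimate
        v = b2 * v + (1 - b2) * g * g      # second-moment estimate
        m_hat = m / (1 - b1**t)            # bias corrections
        v_hat = v / (1 - b2**t)
        x -= step_size * m_hat / (np.sqrt(v_hat) + eps)
    return x

data = np.array([0.8, 1.0, 1.2])

def neg_log_posterior_grad(mu):
    """Gradient of -log p(mu | data) for y_i ~ N(mu, 1) and mu ~ N(0, 1)."""
    return mu + np.sum(mu - data)

mu_map = adam_minimize(neg_log_posterior_grad, x0=0.0, num_steps=1000, step_size=0.005)
mu_exact = data.sum() / (len(data) + 1)  # closed-form MAP estimate, equal to 0.75
```

Too few steps or too small a step size leaves the estimate short of the optimum, which is the practical reason to monitor the loss when tuning num_steps and step_size.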
- predict_in_batches(rng_key, X_new, batch_size=100, samples=None, predict_fn=None, noiseless=False, device=None, **kwargs)[source]
Make prediction at X_new with sampled GP parameters by splitting the input array into chunks (“batches”) and running predict_fn (defaults to self.predict) on each of them one-by-one to avoid a memory overflow
- Return type:
Tuple[Array,Array]
- predict(rng_key, X_new, samples=None, noiseless=False, device=None, **kwargs)[source]
Make prediction at X_new points using posterior samples for GP parameters
- Parameters:
rng_key (Array) – random number generator key
X_new (Array) – new inputs with (number of points, number of features) dimensions
noiseless (bool) – Noise-free prediction. It is set to False by default as new/unseen data is assumed to follow the same distribution as the training data. Hence, since we introduce a model noise by default for the training data, we also want to include that noise in our prediction.
device (Type[Device]) – optionally specify a cpu or gpu device on which to make a prediction; e.g., device=jax.devices("gpu")[0]
**jitter – Small positive term added to the diagonal part of a covariance matrix for numerical stability (Default: 1e-6)
- Return type:
Tuple[Array,Array]
- Returns
Center of mass of the sampled means and all the sampled predictions
- get_mvn_posterior(X_new, params, noiseless=False, **kwargs)
Returns parameters (mean and cov) of multivariate normal posterior for a single sample of GP parameters
- Return type:
Tuple[Array,Array]
- model(X, y=None, **kwargs)
GP probabilistic model with inputs X and targets y
- Return type:
None
- sample_from_prior(rng_key, X, num_samples=10)
Samples from prior predictive distribution at X
- class gpax.models.sparse_gp.viSparseGP(input_dim, kernel, mean_fn=None, kernel_prior=None, mean_fn_prior=None, noise_prior=None, noise_prior_dist=None, lengthscale_prior_dist=None, guide='delta')[source]
Bases: viGP
Variational inference-based sparse Gaussian process
- Parameters:
input_dim (int) – Number of input dimensions
kernel (str) – Kernel function (‘RBF’, ‘Matern’, ‘Periodic’, or custom function)
mean_fn (Optional[Callable[[Array,Dict[str,Array]],Array]]) – Optional deterministic mean function (use ‘mean_fn_prior’ to make it probabilistic)
kernel_prior (Optional[Callable[[],Dict[str,Array]]]) – Optional custom priors over kernel hyperparameters; uses LogNormal(0, 1) by default
mean_fn_prior (Optional[Callable[[],Dict[str,Array]]]) – Optional priors over mean function parameters
noise_prior_dist (Optional[Distribution]) – Optional custom prior distribution over the observational noise variance. Defaults to LogNormal(0, 1).
lengthscale_prior_dist (Optional[Distribution]) – Optional custom prior distribution over kernel lengthscale. Defaults to LogNormal(0, 1).
guide (str) – Auto-guide option, use ‘delta’ (default) or ‘normal’
- model(X, y=None, Xu=None, **kwargs)[source]
Probabilistic sparse Gaussian process regression model
- Return type:
None
- fit(rng_key, X, y, inducing_points_ratio=0.1, inducing_points_selection='random', num_steps=1000, step_size=0.005, progress_bar=True, print_summary=True, device=None, **kwargs)[source]
Run variational inference to learn sparse GP (hyper)parameters
- Parameters:
rng_key (array) – random number generator key
X (Array) – 2D feature vector with (number of points, number of features) dimensions
y (Array) – 1D target vector with (n,) dimensions
inducing_points_ratio – Inducing points ratio. Must be a float between 0 and 1. Default value is 0.1.
num_steps (int) – number of SVI steps
step_size (float) – step size schedule for Adam optimizer
progress_bar (bool) – show progress bar
print_summary (bool) – print summary at the end of training
device (Type[Device]) – optionally specify a cpu or gpu device on which to run the inference; e.g., device=jax.devices("cpu")[0]
**jitter – Small positive term added to the diagonal part of a covariance matrix for numerical stability (Default: 1e-6)
- Return type:
None
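As a rough illustration of what inducing_points_ratio controls, the sketch below draws a random subset of the training inputs whose size is the ratio times the number of points. The helper name select_inducing_points and the rounding rule are assumptions made for this example; GPax's own selection logic may differ.

```python
import numpy as np

def select_inducing_points(X, ratio=0.1, seed=0):
    """Return a random subset of the rows of X to serve as inducing points."""
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must be a float between 0 and 1")
    n = X.shape[0]
    # Assumed rounding rule: truncate, but keep at least one point
    m = max(1, int(ratio * n))
    rng = np.random.default_rng(seed)
    idx = rng.choice(n, size=m, replace=False)
    return X[np.sort(idx)]

X = np.linspace(0.0, 1.0, 200).reshape(-1, 1)  # (n, 1) feature array
Xu = select_inducing_points(X, ratio=0.1)      # 10% of the training inputs
```

A smaller ratio reduces the size of the matrices the sparse GP has to factorize, at the cost of a coarser approximation.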
- get_mvn_posterior(X_new, params, noiseless=False, **kwargs)[source]
Returns parameters (mean and cov) of multivariate normal posterior for a single sample of GP parameters
- Return type:
Tuple[Array,Array]
- get_samples()
Get posterior samples
- Return type:
Dict[str,Array]
- predict(rng_key, X_new, samples=None, noiseless=False, device=None, **kwargs)
Make prediction at X_new points using posterior samples for GP parameters
- Parameters:
rng_key (Array) – random number generator key
X_new (Array) – new inputs with (number of points, number of features) dimensions
noiseless (bool) – Noise-free prediction. It is set to False by default as new/unseen data is assumed to follow the same distribution as the training data. Hence, since we introduce model noise by default for the training data, we also want to include that noise in our prediction.
device (Type[Device]) – optionally specify a cpu or gpu device on which to make a prediction; e.g., device=jax.devices("gpu")[0]
**jitter – Small positive term added to the diagonal part of a covariance matrix for numerical stability (Default: 1e-6)
- Return type:
Tuple[Array,Array]
- Returns
Center of mass of the sampled means, and all the sampled predictions
- predict_in_batches(rng_key, X_new, batch_size=100, samples=None, predict_fn=None, noiseless=False, device=None, **kwargs)
Make prediction at X_new with sampled GP parameters by splitting the input array into chunks (“batches”) and running predict_fn (defaults to self.predict) on each of them one-by-one to avoid a memory overflow
- Return type:
Tuple[Array,Array]
- sample_from_prior(rng_key, X, num_samples=10)
Samples from prior predictive distribution at X
Deep Kernel Learning - Fully Bayesian Implementation
- class gpax.models.dkl.DKL(input_dim, z_dim=2, kernel='RBF', kernel_prior=None, nn=None, nn_prior=None, latent_prior=None, hidden_dim=None, **kwargs)[source]
Bases: ExactGP
Fully Bayesian implementation of deep kernel learning
- Parameters:
input_dim (int) – Number of input dimensions
z_dim (int) – Latent space dimensionality (defaults to 2)
kernel (str) – Kernel function (‘RBF’, ‘Matern’, ‘Periodic’, or custom function)
kernel_prior (Optional[Callable[[],Dict[str,Array]]]) – Optional priors over kernel hyperparameters; uses LogNormal(0,1) by default
nn (Optional[Callable[[Array,Dict[str,Array]],Array]]) – Custom neural network (‘feature extractor’); uses a 3-layer MLP with hyperbolic tangent activations by default
nn_prior (Optional[Callable[[],Dict[str,Array]]]) – Priors over the weights and biases in ‘nn’; uses normal priors by default
latent_prior (Optional[Callable[[Array],Dict[str,Array]]]) – Optional prior over the latent space (BNN embedding); uses none by default
hidden_dim (Optional[List[int]]) – Optional custom MLP architecture. For example [16, 8, 4] corresponds to a 3-layer neural network backbone containing 16, 8, and 4 neurons activated by tanh(). The latent layer is added automatically and doesn’t have to be specified here. Defaults to [64, 32].
**kwargs – Optional custom prior distributions over observational noise (noise_prior_dist) and kernel lengthscale (lengthscale_prior_dist)
Examples
DKL with image patches as inputs and a 1-d vector as targets
>>> import gpax
>>> # Get random number generator keys for training and prediction
>>> key1, key2 = gpax.utils.get_keys()
>>> # Input data dimensions are (n, height*width*channels)
>>> data_dim = X.shape[-1]
>>> # Initialize DKL model with 2 latent dimensions
>>> dkl = gpax.DKL(data_dim, z_dim=2, kernel='RBF')
>>> # Train model by parallelizing HMC chains on a single GPU
>>> dkl.fit(key1, X, y, num_warmup=333, num_samples=333, num_chains=3, chain_method='vectorized')
>>> # Obtain posterior mean and samples from DKL posterior at new inputs
>>> # using batches to avoid memory overflow
>>> y_pred, y_samples = dkl.predict_in_batches(key2, X_new)
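To make the hidden_dim convention concrete, here is a minimal NumPy sketch of a tanh MLP whose latent layer of size z_dim is appended to the listed hidden layers. The helper names init_mlp and mlp_forward, the random initialization, and the choice of a linear latent layer are assumptions for illustration only, not GPax's implementation.

```python
import numpy as np

def init_mlp(input_dim, hidden_dim, z_dim, seed=0):
    """Create (weight, bias) pairs for each layer, including the latent one."""
    rng = np.random.default_rng(seed)
    # The latent layer (z_dim) is appended to the user-specified hidden layers
    sizes = [input_dim, *hidden_dim, z_dim]
    return [(rng.normal(size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def mlp_forward(X, params):
    """Apply tanh hidden layers followed by a linear latent layer (assumed)."""
    h = X
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)
    W, b = params[-1]
    return h @ W + b

params = init_mlp(input_dim=10, hidden_dim=[16, 8, 4], z_dim=2)
Z = mlp_forward(np.ones((5, 10)), params)  # embedding with shape (5, 2)
```

In the fully Bayesian setting the weights would be sampled from their priors rather than fixed, and the GP kernel would then operate on the embedding Z.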
- get_mvn_posterior(X_new, params, noiseless=False, **kwargs)[source]
Returns parameters (mean and cov) of multivariate normal posterior for a single sample of GP parameters
- Return type:
Tuple[Array,Array]
- embed(X_new)[source]
Embeds data into the latent space using the inferred weights of the DKL’s Bayesian neural network
- Return type:
Array
- fit(rng_key, X, y, num_warmup=2000, num_samples=2000, num_chains=1, chain_method='sequential', progress_bar=True, print_summary=True, device=None, **kwargs)
Run Hamiltonian Monte Carlo to infer the GP parameters
- Parameters:
rng_key (array) – random number generator key
X (Array) – 2D feature vector
y (Array) – 1D target vector
num_warmup (int) – number of HMC warmup states
num_samples (int) – number of HMC samples
num_chains (int) – number of HMC chains
chain_method (str) – ‘sequential’, ‘parallel’ or ‘vectorized’
progress_bar (bool) – show progress bar
print_summary (bool) – print summary at the end of sampling
device (Type[Device]) – optionally specify a cpu or gpu device on which to run the inference; e.g., device=jax.devices("cpu")[0]
**jitter – Small positive term added to the diagonal part of a covariance matrix for numerical stability (Default: 1e-6)
- Return type:
None
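The role of the jitter term can be seen in a small NumPy sketch: a kernel matrix built from duplicate inputs is singular, and adding a small positive value to its diagonal makes the Cholesky factorization well defined. The rbf_kernel helper below is a stand-in written for this example, not a GPax function.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0):
    """Squared-exponential kernel between the rows of X1 and X2."""
    sq_dist = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale**2)

X = np.array([[0.0], [0.0], [1.0]])  # the first two inputs are duplicates
K = rbf_kernel(X, X)                 # rows 0 and 1 are identical, so K is singular

jitter = 1e-6
K_stable = K + jitter * np.eye(K.shape[0])  # small positive term on the diagonal
L = np.linalg.cholesky(K_stable)            # factorization is now well defined
```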
- get_samples(chain_dim=False)
Get posterior samples (after running the MCMC chains)
- Return type:
Dict[str,Array]
- predict(rng_key, X_new, samples=None, n=1, filter_nans=False, noiseless=False, device=None, **kwargs)
Make prediction at X_new points using posterior samples for GP parameters
- Parameters:
rng_key (Array) – random number generator key
X_new (Array) – new inputs with (number of points, number of features) dimensions
samples (Optional[Dict[str,Array]]) – optional (different) samples with GP parameters
n (int) – number of samples from Multivariate Normal posterior for each HMC sample with GP parameters
filter_nans (bool) – filter out samples containing NaN values (if any)
noiseless (bool) – Noise-free prediction. It is set to False by default as new/unseen data is assumed to follow the same distribution as the training data. Hence, since we introduce model noise by default for the training data, we also want to include that noise in our prediction.
device (Type[Device]) – optionally specify a cpu or gpu device on which to make a prediction; e.g., device=jax.devices("gpu")[0]
**jitter – Small positive term added to the diagonal part of a covariance matrix for numerical stability (Default: 1e-6)
- Return type:
Tuple[Array,Array]
- Returns
Center of mass of the sampled means, and all the sampled predictions
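The two returned quantities, and the effect of the noiseless flag, can be sketched in NumPy as follows. This assumes that each posterior sample of the GP parameters yields a multivariate normal (mean, cov) and that the noise variance is added to the covariance diagonal when noiseless is False; the function name and the toy inputs are hypothetical.

```python
import numpy as np

def sample_predictions(means, covs, noise_vars, noiseless=False, n=1, seed=0):
    """Draw n predictive samples per parameter sample and average the means."""
    rng = np.random.default_rng(seed)
    draws = []
    for mean, cov, noise in zip(means, covs, noise_vars):
        if not noiseless:
            # Include observational noise on the diagonal of the covariance
            cov = cov + noise * np.eye(cov.shape[0])
        draws.append(rng.multivariate_normal(mean, cov, size=n))
    y_sampled = np.stack(draws)      # (num_param_samples, n, num_points)
    y_pred = np.mean(means, axis=0)  # center of mass of the sampled means
    return y_pred, y_sampled

# Toy posterior: three parameter samples, two prediction points
means = np.array([[0.0, 1.0], [0.2, 0.8], [0.1, 0.9]])
covs = np.array([0.01 * np.eye(2)] * 3)
noise_vars = np.array([0.05, 0.04, 0.06])
y_pred, y_sampled = sample_predictions(means, covs, noise_vars, n=5)
```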
- predict_in_batches(rng_key, X_new, batch_size=100, samples=None, n=1, filter_nans=False, predict_fn=None, noiseless=False, device=None, **kwargs)
Make prediction at X_new with sampled GP parameters by splitting the input array into chunks (“batches”) and running predict_fn (defaults to self.predict) on each of them one-by-one to avoid a memory overflow
- Return type:
Tuple[Array,Array]
- sample_from_prior(rng_key, X, num_samples=10)
Samples from prior predictive distribution at X
Deep Kernel Learning - Approximate Bayesian
- class gpax.models.vidkl.viDKL(input_dim, z_dim=2, kernel='RBF', kernel_prior=None, nn=None, nn_prior=True, latent_prior=None, guide='delta', **kwargs)[source]
Bases: ExactGP
Implementation of the variational inference-based deep kernel learning
- Parameters:
input_dim (Union[int,Tuple[int]]) – Input features dimensions (e.g. 64*64 for a stack of flattened 64-by-64 images)
z_dim (int) – Latent space dimensionality (defaults to 2)
kernel (str) – Kernel function (‘RBF’, ‘Matern’, ‘Periodic’, or custom function)
kernel_prior (Optional[Callable[[],Dict[str,Array]]]) – Optional priors over kernel hyperparameters; uses LogNormal(0,1) by default
nn (Optional[Callable[[Array],Array]]) – Custom neural network (‘feature extractor’); uses a 3-layer MLP with ReLU activations by default
nn_prior (bool) – Places probabilistic priors over NN weights and biases (Default: True)
latent_prior (Optional[Callable[[Array],Dict[str,Array]]]) – Optional prior over the latent space (NN embedding); uses none by default
guide (str) – Auto-guide option, use ‘delta’ (default) or ‘normal’
**kwargs – Optional custom prior distributions over observational noise (noise_prior_dist) and kernel lengthscale (lengthscale_prior_dist)
Examples
vi-DKL with image patches as inputs and a 1-d vector as targets
>>> import gpax
>>> # Get random number generator keys for training and prediction
>>> key1, key2 = gpax.utils.get_keys()
>>> # Input data dimensions are (n, height*width*channels)
>>> data_dim = X.shape[-1]
>>> # Initialize vi-DKL model with 2 latent dimensions
>>> dkl = gpax.viDKL(input_dim=data_dim, z_dim=2, kernel='RBF')
>>> # Train a model
>>> dkl.fit(key1, X, y, num_steps=1000, step_size=0.005)
>>> # Obtain posterior mean and variance ('uncertainty') at new inputs
>>> y_mean, y_var = dkl.predict(key2, X_new)
- single_fit(rng_key, X, y, num_steps=1000, step_size=0.005, print_summary=True, progress_bar=True, **kwargs)[source]
Optimizes parameters of a single DKL model
- Return type:
None
- fit(rng_key, X, y, num_steps=1000, step_size=0.005, print_summary=True, progress_bar=True, **kwargs)[source]
Run stochastic variational inference to learn the parameters of the DKL model(s)
- Parameters:
rng_key (array) – random number generator key
X (Array) – Input high-dimensional features
y (Array) – Target output (scalar or vector)
num_steps (int) – number of SVI steps
step_size (float) – step size schedule for Adam optimizer
print_summary (bool) – print summary at the end of training
progress_bar – show progress bar (works only for scalar outputs)
- get_mvn_posterior(X_new, nn_params, k_params, noiseless=False, y_residual=None, **kwargs)[source]
Returns predictive mean and covariance at new points (mean and cov, where cov.diagonal() is ‘uncertainty’) given a single set of DKL parameters
- Return type:
Tuple[Array,Array]
- sample_from_posterior(rng_key, X_new, n=1000, noiseless=False, **kwargs)[source]
Samples from the DKL posterior at X_new points
- Return type:
Tuple[Array]
- get_samples()[source]
Returns a tuple with trained NN weights and kernel hyperparameters
- Return type:
Tuple[Dict[str,Array]]
- predict_in_batches(rng_key, X_new, batch_size=100, params=None, noiseless=False, **kwargs)[source]
Make prediction at X_new with sampled DKL parameters by splitting the input array into chunks (“batches”) and running self.predict on each of them one-by-one to avoid a memory overflow
- Return type:
Tuple[Array,Array]
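The batching idea itself is simple enough to sketch without the library: split the inputs into consecutive chunks, run the predictor on each chunk, and concatenate the results. The function below and the toy_predict stand-in are hypothetical illustrations, assuming a predictor that returns a mean and a variance per input row.

```python
import numpy as np

def predict_in_batches(predict_fn, X_new, batch_size=100):
    """Apply predict_fn to consecutive row chunks of X_new and stitch the results."""
    means, variances = [], []
    for start in range(0, X_new.shape[0], batch_size):
        chunk = X_new[start:start + batch_size]  # the last chunk may be smaller
        mean, var = predict_fn(chunk)
        means.append(mean)
        variances.append(var)
    return np.concatenate(means), np.concatenate(variances)

def toy_predict(X):
    """Stand-in predictor: one mean and one variance per input row."""
    return X.sum(axis=1), np.ones(X.shape[0])

X_new = np.arange(750, dtype=float).reshape(250, 3)
y_mean, y_var = predict_in_batches(toy_predict, X_new, batch_size=100)
```

Only one chunk of intermediate results has to fit in memory at a time, which is the point of the method when X_new is large.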
- predict(rng_key, X_new, params=None, noiseless=False, *args, **kwargs)[source]
Make prediction at X_new points using a trained DKL model(s)
- Parameters:
rng_key (Array) – random number generator key
X_new (Array) – New inputs
params (Optional[Tuple[Dict[str,Array]]]) – Tuple with neural network weights and kernel parameters (optional)
noiseless (bool) – Noise-free prediction. It is set to False by default as new/unseen data is assumed to follow the same distribution as the training data. Hence, since we introduce model noise for the training data, we also want to include that noise in our prediction.
- Return type:
Tuple[Array,Array]
- Returns:
Predictive mean and variance
- fit_predict(rng_key, X, y, X_new, num_steps=1000, step_size=0.005, n_models=1, batch_size=100, noiseless=False, ensemble_method='vectorized', print_summary=True, progress_bar=True, **kwargs)[source]
Run SVI to learn DKL model(s) parameters and make a prediction with trained model(s) on new data. Allows using an ensemble of models.
- Parameters:
rng_key (array) – random number generator key
X (Array) – Input high-dimensional features
y (Array) – Target output (scalar or vector)
X_new (Array) – New (‘test’) data
num_steps (int) – number of SVI steps
step_size (float) – step size schedule for Adam optimizer
n_models (int) – number of models in the ensemble (defaults to 1)
batch_size (int) – prediction batch size (to avoid memory overflows)
noiseless (bool) – Noise-free prediction. It is set to False by default as new/unseen data is assumed to follow the same distribution as the training data. Hence, since we introduce model noise for the training data, we also want to include that noise in our prediction.
ensemble_method (str) – ‘vectorized’ (single GPU) or ‘parallel’ (multiple GPUs)
print_summary (bool) – print summary at the end of training
progress_bar – show progress bar (works only for scalar outputs)
- Return type:
Tuple[Array,Array]
- Returns:
Predictive mean and variance
- embed(X_new)[source]
Use trained neural network(s) to embed the input data into the latent space(s)
- Return type:
Array
- sample_from_prior(rng_key, X, num_samples=10)
Samples from prior predictive distribution at X
Infinite-width Bayesian Neural Networks
- class gpax.models.ibnn.iBNN(input_dim, depth=3, activation='erf', mean_fn=None, nngp_prior=None, mean_fn_prior=None, noise_prior=None, noise_prior_dist=None)[source]
Bases: ExactGP
Infinite-width Bayesian neural net (iBNN)
- Parameters:
input_dim (int) – Number of input dimensions
depth (int) – The number of layers in the corresponding infinite-width neural network.
activation (str) – activation function (‘erf’ or ‘relu’)
mean_fn (Optional[Callable[[Array,Dict[str,Array]],Array]]) – Optional deterministic mean function (use ‘mean_fn_prior’ to make it probabilistic)
nngp_prior (Optional[Callable[[],Dict[str,Array]]]) – Optional custom priors over NNGP kernel hyperparameters; uses LogNormal(0,1) by default
mean_fn_prior (Optional[Callable[[],Dict[str,Array]]]) – Optional priors over mean function parameters
noise_prior_dist (Optional[Distribution]) – Optional custom prior distribution over the observational noise variance. Defaults to LogNormal(0,1).
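For intuition about how depth and the ‘erf’ activation enter the kernel, the sketch below implements the standard arcsine recursion for an infinite-width network with erf nonlinearities. The function name, the input-layer normalization, the treatment of depth as the number of recursion steps, and the weight and bias variances var_w and var_b are all assumptions for this example; GPax's NNGP kernel may be parameterized differently.

```python
import numpy as np

def erf_nngp_kernel(X1, X2, depth=3, var_w=1.0, var_b=0.1):
    """NNGP covariance between rows of X1 and X2 for an erf network (sketch)."""
    d = X1.shape[1]
    # Input-layer covariances (cross terms and the two sets of diagonal terms)
    k12 = var_b + var_w * (X1 @ X2.T) / d
    k11 = var_b + var_w * np.sum(X1**2, axis=1) / d
    k22 = var_b + var_w * np.sum(X2**2, axis=1) / d
    for _ in range(depth):
        # Arcsine formula for E[erf(u) erf(v)] under a bivariate Gaussian
        norm = np.sqrt((1 + 2 * k11)[:, None] * (1 + 2 * k22)[None, :])
        k12 = var_b + var_w * (2 / np.pi) * np.arcsin(2 * k12 / norm)
        k11 = var_b + var_w * (2 / np.pi) * np.arcsin(2 * k11 / (1 + 2 * k11))
        k22 = var_b + var_w * (2 / np.pi) * np.arcsin(2 * k22 / (1 + 2 * k22))
    return k12

X = np.random.default_rng(0).normal(size=(6, 3))
K = erf_nngp_kernel(X, X, depth=3)  # (6, 6) covariance matrix
```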
- fit(rng_key, X, y, num_warmup=2000, num_samples=2000, num_chains=1, chain_method='sequential', progress_bar=True, print_summary=True, device=None, **kwargs)
Run Hamiltonian Monte Carlo to infer the GP parameters
- Parameters:
rng_key (array) – random number generator key
X (Array) – 2D feature vector
y (Array) – 1D target vector
num_warmup (int) – number of HMC warmup states
num_samples (int) – number of HMC samples
num_chains (int) – number of HMC chains
chain_method (str) – ‘sequential’, ‘parallel’ or ‘vectorized’
progress_bar (bool) – show progress bar
print_summary (bool) – print summary at the end of sampling
device (Type[Device]) – optionally specify a cpu or gpu device on which to run the inference; e.g., device=jax.devices("cpu")[0]
**jitter – Small positive term added to the diagonal part of a covariance matrix for numerical stability (Default: 1e-6)
- Return type:
None
- get_mvn_posterior(X_new, params, noiseless=False, **kwargs)
Returns parameters (mean and cov) of multivariate normal posterior for a single sample of GP parameters
- Return type:
Tuple[Array,Array]
- get_samples(chain_dim=False)
Get posterior samples (after running the MCMC chains)
- Return type:
Dict[str,Array]
- model(X, y=None, **kwargs)
GP probabilistic model with inputs X and targets y
- Return type:
None
- predict(rng_key, X_new, samples=None, n=1, filter_nans=False, noiseless=False, device=None, **kwargs)
Make prediction at X_new points using posterior samples for GP parameters
- Parameters:
rng_key (Array) – random number generator key
X_new (Array) – new inputs with (number of points, number of features) dimensions
samples (Optional[Dict[str,Array]]) – optional (different) samples with GP parameters
n (int) – number of samples from Multivariate Normal posterior for each HMC sample with GP parameters
filter_nans (bool) – filter out samples containing NaN values (if any)
noiseless (bool) – Noise-free prediction. It is set to False by default as new/unseen data is assumed to follow the same distribution as the training data. Hence, since we introduce model noise by default for the training data, we also want to include that noise in our prediction.
device (Type[Device]) – optionally specify a cpu or gpu device on which to make a prediction; e.g., device=jax.devices("gpu")[0]
**jitter – Small positive term added to the diagonal part of a covariance matrix for numerical stability (Default: 1e-6)
- Return type:
Tuple[Array,Array]
- Returns
Center of mass of the sampled means, and all the sampled predictions
- predict_in_batches(rng_key, X_new, batch_size=100, samples=None, n=1, filter_nans=False, predict_fn=None, noiseless=False, device=None, **kwargs)
Make prediction at X_new with sampled GP parameters by splitting the input array into chunks (“batches”) and running predict_fn (defaults to self.predict) on each of them one-by-one to avoid a memory overflow
- Return type:
Tuple[Array,Array]
- sample_from_prior(rng_key, X, num_samples=10)
Samples from prior predictive distribution at X
Multi-Task Learning
- class gpax.models.mtgp.MultiTaskGP(input_dim, data_kernel, num_latents=None, shared_input_space=False, num_tasks=None, rank=None, mean_fn=None, data_kernel_prior=None, mean_fn_prior=None, noise_prior=None, noise_prior_dist=None, lengthscale_prior_dist=None, W_prior_dist=None, v_prior_dist=None, output_scale=False, **kwargs)[source]
Bases: ExactGP
Gaussian process for multi-task/fidelity learning
- Parameters:
input_dim (int) – Number of input dimensions
data_kernel (str) – Kernel function operating on data inputs (‘RBF’, ‘Matern’, ‘Periodic’, or a custom function)
num_latents (int) – Number of latent functions. Typically equal to or less than the number of tasks
shared_input_space (bool) – If True, assumes that all tasks share the same input space and uses a multivariate kernel (Kronecker product). If False (default), assumes that different tasks have different numbers of observations and uses a multitask kernel (elementwise multiplication). In that case, the task indices must be appended as the last column of the input vector.
num_tasks (int) – Number of tasks. This is only needed if shared_input_space is True.
rank (Optional[int]) – Rank of the weight matrix in the task kernel. Cannot be larger than the number of tasks. Higher rank implies higher correlation. Uses (num_tasks - 1) when not specified.
mean_fn (Optional[Callable[[Array,Dict[str,Array]],Array]]) – Optional deterministic mean function (use ‘mean_fn_prior’ to make it probabilistic)
data_kernel_prior (Optional[Callable[[],Dict[str,Array]]]) – Optional custom priors over the data kernel hyperparameters
mean_fn_prior (Optional[Callable[[],Dict[str,Array]]]) – Optional priors over mean function parameters
noise_prior_dist (Optional[Distribution]) – Optional custom prior distribution over the observational noise variance. Defaults to LogNormal(0,1).
lengthscale_prior_dist (Optional[Distribution]) – Optional custom prior distribution over kernel lengthscale. Defaults to LogNormal(0, 1).
W_prior_dist (Optional[Distribution]) – Optional custom prior distribution over W in the task kernel, \(WW^T + \mathrm{diag}(v)\). Defaults to Normal(0, 10).
v_prior_dist (Optional[Distribution]) – Optional custom prior distribution over v in the task kernel, \(WW^T + \mathrm{diag}(v)\). Must be non-negative. Defaults to LogNormal(0, 1).
task_kernel_prior – Optional custom priors over task kernel parameters; defaults to Normal(0, 10) for weights W and LogNormal(0, 1) for variances v.
output_scale (bool) – Option to sample data kernel’s output scale. Defaults to False to avoid over-parameterization (the scale is already absorbed into task kernel)
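The task kernel \(WW^T + \mathrm{diag}(v)\) and the two ways of combining it with the data kernel can be sketched in NumPy. The rbf_kernel helper, the random draws for W and v, and the ordering of the Kronecker product are assumptions made for this illustration; in GPax the parameters come from their priors and the data layout may differ.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0):
    """Squared-exponential data kernel between the rows of X1 and X2."""
    sq_dist = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale**2)

rng = np.random.default_rng(0)
num_tasks, rank = 3, 2
W = rng.normal(size=(num_tasks, rank))   # low-rank task weights
v = np.exp(rng.normal(size=num_tasks))   # positive per-task variances
B = W @ W.T + np.diag(v)                 # task kernel, (num_tasks, num_tasks)

# shared_input_space=True: every task is observed at the same inputs,
# so the full covariance is a Kronecker product of task and data kernels
X = np.linspace(0.0, 1.0, 4).reshape(-1, 1)
K_shared = np.kron(B, rbf_kernel(X, X))  # (num_tasks * n, num_tasks * n)

# shared_input_space=False: the task index is the last input column,
# and the data kernel is multiplied elementwise by the matching task entries
X_aug = np.array([[0.0, 0], [0.5, 0], [0.2, 1], [0.9, 2]])
task_idx = X_aug[:, -1].astype(int)
K_data = rbf_kernel(X_aug[:, :-1], X_aug[:, :-1])
K_multi = K_data * B[np.ix_(task_idx, task_idx)]
```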
- model(X, y=None, **kwargs)[source]
Multitask GP probabilistic model with inputs X and targets y
- Return type:
None
- fit(rng_key, X, y, num_warmup=2000, num_samples=2000, num_chains=1, chain_method='sequential', progress_bar=True, print_summary=True, device=None, **kwargs)
Run Hamiltonian Monte Carlo to infer the GP parameters
- Parameters:
rng_key (array) – random number generator key
X (Array) – 2D feature vector
y (Array) – 1D target vector
num_warmup (int) – number of HMC warmup states
num_samples (int) – number of HMC samples
num_chains (int) – number of HMC chains
chain_method (str) – ‘sequential’, ‘parallel’ or ‘vectorized’
progress_bar (bool) – show progress bar
print_summary (bool) – print summary at the end of sampling
device (Type[Device]) – optionally specify a cpu or gpu device on which to run the inference; e.g., device=jax.devices("cpu")[0]
**jitter – Small positive term added to the diagonal part of a covariance matrix for numerical stability (Default: 1e-6)
- Return type:
None
- get_mvn_posterior(X_new, params, noiseless=False, **kwargs)
Returns parameters (mean and cov) of multivariate normal posterior for a single sample of GP parameters
- Return type:
Tuple[Array,Array]
- get_samples(chain_dim=False)
Get posterior samples (after running the MCMC chains)
- Return type:
Dict[str,Array]
- predict(rng_key, X_new, samples=None, n=1, filter_nans=False, noiseless=False, device=None, **kwargs)
Make prediction at X_new points using posterior samples for GP parameters
- Parameters:
rng_key (Array) – random number generator key
X_new (Array) – new inputs with (number of points, number of features) dimensions
samples (Optional[Dict[str,Array]]) – optional (different) samples with GP parameters
n (int) – number of samples from Multivariate Normal posterior for each HMC sample with GP parameters
filter_nans (bool) – filter out samples containing NaN values (if any)
noiseless (bool) – Noise-free prediction. It is set to False by default as new/unseen data is assumed to follow the same distribution as the training data. Hence, since we introduce model noise by default for the training data, we also want to include that noise in our prediction.
device (Type[Device]) – optionally specify a cpu or gpu device on which to make a prediction; e.g., device=jax.devices("gpu")[0]
**jitter – Small positive term added to the diagonal part of a covariance matrix for numerical stability (Default: 1e-6)
- Return type:
Tuple[Array,Array]
- Returns
Center of mass of the sampled means, and all the sampled predictions
- predict_in_batches(rng_key, X_new, batch_size=100, samples=None, n=1, filter_nans=False, predict_fn=None, noiseless=False, device=None, **kwargs)
Make prediction at X_new with sampled GP parameters by splitting the input array into chunks (“batches”) and running predict_fn (defaults to self.predict) on each of them one-by-one to avoid a memory overflow
- Return type:
Tuple[Array,Array]
- sample_from_prior(rng_key, X, num_samples=10)
Samples from prior predictive distribution at X
- class gpax.models.vi_mtdkl.viMTDKL(input_dim, z_dim=2, data_kernel='RBF', num_latents=None, shared_input_space=False, num_tasks=None, rank=None, data_kernel_prior=None, nn=None, nn_prior=True, guide='delta', W_prior_dist=None, v_prior_dist=None, task_kernel_prior=None, **kwargs)[source]
Bases: viDKL
Implementation of the variational inference-based deep kernel learning for multi-task/fidelity problems
- Parameters:
input_dim (int) – Number of input dimensions, not counting the column with task indices (if any)
z_dim (int) – Latent space dimensionality (defaults to 2)
data_kernel (str) – Kernel function operating on data inputs (‘RBF’, ‘Matern’, ‘Periodic’, or a custom function)
num_latents (int) – Number of latent functions. Typically equal to or less than the number of tasks
shared_input_space (bool) – If True, assumes that all tasks share the same input space and uses a multivariate kernel (Kronecker product). If False (default), assumes that different tasks have different numbers of observations and uses a multitask kernel (elementwise multiplication). In that case, the task indices must be appended as the last column of the input vector.
num_tasks (int) – Number of tasks. This is only needed if shared_input_space is True.
rank (Optional[int]) – Rank of the weight matrix in the task kernel. Cannot be larger than the number of tasks. Higher rank implies higher correlation. Uses (num_tasks - 1) when not specified.
data_kernel_prior (Optional[Callable[[],Dict[str,Array]]]) – Optional priors over kernel hyperparameters; uses LogNormal(0,1) by default
nn (Optional[Callable[[Array],Array]]) – Custom neural network (‘feature extractor’); uses a 3-layer MLP with ReLU activations by default
nn_prior (bool) – Places probabilistic priors over NN weights and biases (Default: True)
latent_prior – Optional prior over the latent space (NN embedding); uses none by default
guide (str) – Auto-guide option, use ‘delta’ (default) or ‘normal’
W_prior_dist (Optional[Distribution]) – Optional custom prior distribution over W in the task kernel, \(WW^T + \mathrm{diag}(v)\). Defaults to Normal(0, 10).
v_prior_dist (Optional[Distribution]) – Optional custom prior distribution over v in the task kernel, \(WW^T + \mathrm{diag}(v)\). Must be non-negative. Defaults to LogNormal(0, 1).
task_kernel_prior (Optional[Callable[[],Dict[str,Array]]]) – Optional custom priors over task kernel parameters; defaults to Normal(0, 10) for weights W and LogNormal(0, 1) for variances v.
**kwargs – Optional custom prior distributions over observational noise (noise_prior_dist) and kernel lengthscale (lengthscale_prior_dist)
- get_mvn_posterior(X_new, nn_params, k_params, noiseless=False, y_residual=None, **kwargs)[source]
Returns predictive mean and covariance at new points (mean and cov, where cov.diagonal() is ‘uncertainty’) given a single set of DKL parameters
- Return type:
Tuple[Array,Array]
- embed(X_new)
Use trained neural network(s) to embed the input data into the latent space(s)
- Return type:
Array
- fit(rng_key, X, y, num_steps=1000, step_size=0.005, print_summary=True, progress_bar=True, **kwargs)
Run stochastic variational inference to learn the parameters of the DKL model(s)
- Parameters:
rng_key (array) – random number generator key
X (Array) – Input high-dimensional features
y (Array) – Target output (scalar or vector)
num_steps (int) – number of SVI steps
step_size (float) – step size schedule for Adam optimizer
print_summary (bool) – print summary at the end of training
progress_bar – show progress bar (works only for scalar outputs)
- fit_predict(rng_key, X, y, X_new, num_steps=1000, step_size=0.005, n_models=1, batch_size=100, noiseless=False, ensemble_method='vectorized', print_summary=True, progress_bar=True, **kwargs)
Run SVI to learn DKL model(s) parameters and make a prediction with trained model(s) on new data. Allows using an ensemble of models.
- Parameters:
rng_key (array) – random number generator key
X (Array) – Input high-dimensional features
y (Array) – Target output (scalar or vector)
X_new (Array) – New (‘test’) data
num_steps (int) – number of SVI steps
step_size (float) – step size schedule for Adam optimizer
n_models (int) – number of models in the ensemble (defaults to 1)
batch_size (int) – prediction batch size (to avoid memory overflows)
noiseless (bool) – Noise-free prediction. It is set to False by default as new/unseen data is assumed to follow the same distribution as the training data. Hence, since we introduce model noise for the training data, we also want to include that noise in our prediction.
ensemble_method (str) – ‘vectorized’ (single GPU) or ‘parallel’ (multiple GPUs)
print_summary (bool) – print summary at the end of training
progress_bar – show progress bar (works only for scalar outputs)
- Return type:
Tuple[Array,Array]
- Returns:
Predictive mean and variance
- get_samples()
Returns a tuple with trained NN weights and kernel hyperparameters
- Return type:
Tuple[Dict[str,Array]]
- predict(rng_key, X_new, params=None, noiseless=False, *args, **kwargs)
Make prediction at X_new points using a trained DKL model(s)
- Parameters:
rng_key (Array) – random number generator key
X_new (Array) – New inputs
params (Optional[Tuple[Dict[str,Array]]]) – Tuple with neural network weights and kernel parameters (optional)
noiseless (bool) – Noise-free prediction. It is set to False by default as new/unseen data is assumed to follow the same distribution as the training data. Hence, since we introduce model noise for the training data, we also want to include that noise in our prediction.
- Return type:
Tuple[Array,Array]
- Returns:
Predictive mean and variance
- predict_in_batches(rng_key, X_new, batch_size=100, params=None, noiseless=False, **kwargs)
Make prediction at X_new with sampled DKL parameters by splitting the input array into chunks (“batches”) and running self.predict on each of them one-by-one to avoid a memory overflow
- Return type:
Tuple[Array,Array]
- sample_from_posterior(rng_key, X_new, n=1000, noiseless=False, **kwargs)
Samples from the DKL posterior at X_new points
- Return type:
Tuple[Array]
- sample_from_prior(rng_key, X, num_samples=10)
Samples from prior predictive distribution at X
- single_fit(rng_key, X, y, num_steps=1000, step_size=0.005, print_summary=True, progress_bar=True, **kwargs)
Optimizes parameters of a single DKL model
- Return type:
None
Structured Probabilistic Models
- class gpax.models.spm.sPM(model, model_prior, noise_prior=None, noise_prior_dist=None)[source]
Bases: object
Bayesian inference with a structured probabilistic model. Serves as a wrapper over NumPyro’s functions for Bayesian inference on probabilistic models.
- Parameters:
model (Callable[[Array,Dict[str,Array]],Array]) – Deterministic model of the system’s expected behavior.
model_prior (Callable[[],Dict[str,Array]]) – Priors over model parameters
noise_prior (Optional[Callable[[],Dict[str,Array]]]) – Optional custom prior for observation noise; uses LogNormal(0,1) by default.
- fit(rng_key, X, y, num_warmup=2000, num_samples=2000, num_chains=1, chain_method='sequential', progress_bar=True, print_summary=True, device=None)[source]
Run HMC to infer parameters of the structured probabilistic model
- Parameters:
rng_key (array) – random number generator key
X (Array) – 1D or 2D ‘feature vector’ with \((n,)\) or \(n \times num\_features\) dimensions
y (Array) – 1D ‘target vector’ with \((n,)\) dimensions
num_warmup (int) – number of HMC warmup states
num_samples (int) – number of HMC samples
num_chains (int) – number of HMC chains
chain_method (str) – ‘sequential’, ‘parallel’ or ‘vectorized’
progress_bar (bool) – show progress bar
print_summary (bool) – print summary at the end of sampling
device (Type[Device]) – optionally specify a cpu or gpu device on which to run the inference; e.g., device=jax.devices("cpu")[0]
- Return type:
None
- get_samples(chain_dim=False)[source]
Get posterior samples (after running the MCMC chains)
- Return type:
Dict[str,Array]
- sample_from_prior(rng_key, X, num_samples=10)[source]
Samples from prior predictive distribution at X
- predict(rng_key, X_new, samples=None, n=1, filter_nans=False, take_point_predictions_mean=True, device=None)[source]
Make prediction at X_new points using posterior model parameters
- Parameters:
rng_key (Array) – random number generator key
X_new (Array) – 2D vector with new/‘test’ data of \(n \times num\_features\) dimensionality
samples (Optional[Dict[str,Array]]) – optional posterior samples
n (int) – number of samples to draw from normal distribution per single HMC sample
filter_nans (bool) – filter out samples containing NaN values (if any)
take_point_predictions_mean (bool) – take a mean of point predictions (without sampling from the normal distribution)
device (Type[Device]) – optionally specify a cpu or gpu device on which to make a prediction; e.g., device=jax.devices("gpu")[0]
- Return type:
Tuple[Array,Array]
- Returns:
Point predictions (or their mean) and posterior predictive distribution
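As a minimal sketch of what “point predictions (or their mean)” refers to, the example below evaluates a deterministic model once per posterior sample of its parameters and then averages the results. The line_model function and the hard-coded samples dictionary are hypothetical; in practice the samples would come from HMC, and the model must follow the (inputs, parameter dictionary) call signature documented above.

```python
import numpy as np

def line_model(x, params):
    """Deterministic model of expected behavior: a straight line."""
    return params["a"] * x + params["b"]

# Hypothetical posterior samples for the two model parameters
samples = {"a": np.array([1.9, 2.0, 2.1]), "b": np.array([0.4, 0.5, 0.6])}

x_new = np.linspace(0.0, 1.0, 5)
# One point prediction per posterior sample, shape (num_samples, num_points)
point_preds = np.stack([
    line_model(x_new, {"a": a, "b": b})
    for a, b in zip(samples["a"], samples["b"])
])
# Mean of the point predictions (no sampling from the noise distribution)
y_mean = point_preds.mean(axis=0)
```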