Reference of internal functions

This reference gives a detailed overview of internal functions. They are documented here mainly to support development of the package. They are not part of the public API and may change without notice.

HybridVariationalInference.OneBasedVectorWithZero - Type
OneBasedVectorWithZero(data)

A thin wrapper over an AbstractVector that exposes 1-based linear indexing, mapping v[i] to data[axes(data, 1)[i]] on the underlying storage. It additionally provides a value at index 0 (zero by default) that is not stored in the underlying vector.

Example usage:

v = HybridVariationalInference.OneBasedVectorWithZero([10,20,30])
v[1] == 10
v[2] == 20
v[3] == 30
v[0] == 0 # default value at index 0 is zero
v[[1,0,0,3]] == [10,0,0,30]

CommonSolve.solve - Method
solve(prob::AbstractHybridProblem, solver::HybridPosteriorSolver; ...)

Perform the inversion of the HVI problem.

Optional keyword arguments

  • prob: The AbstractHybridProblem to solve (positional argument).
  • scenario: Scenario to query prob, defaults to Val(()).
  • rng: Random generator, defaults to Random.default_rng().
  • gdevs: NamedTuple (;gdev_M, gdev_P) of functions that move computation and data of the ML model and the PBM, respectively, to the GPU (e.g. gpu_device()) or the CPU (identity). Defaults to get_gdev_MP(scenario).
  • θmean_quant: deprecated, defaults to 0.0.
  • is_inferred: set to Val(true) to activate type stability checks

Returns a NamedTuple of

  • probo: A copy of the HybridProblem, with updated optimized parameters
  • interpreters: TODO
  • ϕ: the optimized HVI parameters: a ComponentVector with entries
    • ϕg: The ML model parameter vector,
    • ϕq: ComponentVector of non-ML parameters, including μP: ComponentVector of the mean global PBM parameters at unconstrained scale
  • θP: ComponentVector of the mean global PBM parameters at constrained scale
  • resopt: the structure returned by Optimization.solve. It can contain more information on convergence.

HybridVariationalInference.check_overdispersion - Method
overdispersion_test(Y, μ, Σ; α=0.05)

Test whether the q×p sample matrix Y (rows = individuals) is overdispersed relative to the reference distribution N(μ, Σ).

Returns: Sn, E0, Var0, Z, pvalue_normal, pvalue_chisq

HybridVariationalInference.compose_axes - Method
compose_axes(axtuples::NamedTuple)

Create a new 1-d axis that combines several named axes-tuples, such as key = getaxes(::AbstractComponentArray).

The new axis consists of several ViewAxes. If an axis-tuple consists of only one axis, that axis is used for the view. Otherwise, a ShapedAxis is created from the lengths of the axes, essentially dropping any component information present in the dimensions.

HybridVariationalInference.compute_pvalue_asymptotic_overdispersion_from_dist2 - Method
compute_pvalue_asymptotic_overdispersion_from_dist2(dist2_matrix)

Compute p-value for overdispersion using asymptotic approximation, based on a precomputed matrix of squared Mahalanobis distances.

Arguments

  • dist2_matrix: m × m symmetric matrix of squared Mahalanobis distances (dist2_matrix[i,j] = (x_i - x_j)' Σ⁻¹ (x_i - x_j))
  • n: the dimension of x_i (number of variables)
  • The matrix must be symmetric and contain only upper/lower triangle values

Returns

  • p_value: one-sided p-value for overdispersion
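
The input described above can be illustrated with a minimal sketch of how such a matrix of squared Mahalanobis distances might be assembled. The function name `mahalanobis_dist2_matrix_sketch` and the convention that rows of `X` hold the observations x_i are assumptions made for this illustration. They are not part of the package.

```julia
using LinearAlgebra

# Hypothetical helper (not part of the package): squared Mahalanobis
# distances between all pairs of rows of X, given covariance Σ.
function mahalanobis_dist2_matrix_sketch(X::AbstractMatrix, Σ::AbstractMatrix)
    m = size(X, 1)
    C = cholesky(Symmetric(Σ))          # factorize once, reuse for every pair
    D = zeros(float(eltype(X)), m, m)   # diagonal stays zero
    for j in 1:m, i in 1:(j - 1)
        d = X[i, :] .- X[j, :]
        D[i, j] = D[j, i] = dot(d, C \ d)   # (x_i - x_j)' Σ⁻¹ (x_i - x_j)
    end
    return D
end

X = [0.0 0.0; 1.0 0.0; 0.0 2.0]   # three observations of two variables
Σ = [1.0 0.0; 0.0 4.0]
D = mahalanobis_dist2_matrix_sketch(X, Σ)
```

The result is symmetric with a zero diagonal, which matches the shape expected for dist2_matrix.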

HybridVariationalInference.generate_ζ - Method

Generate samples of (inv-transformed) model parameters, ζ, and the vector of standard deviations, σ, i.e. the diagonal of the Cholesky factor.

Adds the MV-normally distributed residuals, retrieved by sample_ζresid_norm, to the means extracted from parameters and predicted by the machine learning model.

The output shape of size (n_site x n_par x n_MC) is tailored to iterating over each MC sample and then transforming each parameter as a block across sites.

HybridVariationalInference.get_loss_elbo - Method

Create a loss function for parameter vector ϕ, given

  • g(x, ϕ): machine learning model
  • transPMS: transformation from unconstrained space to parameter space
  • f(θMs_tr, θP): mechanistic model
  • interpreters: assigning structure to pure vectors, see neg_elbo_gtf
  • n_MC: number of Monte-Carlo samples used to approximate the expected value across the distribution
  • pbm_covars: tuple of symbols of process-based parameters provided to the ML model
  • θP: ComponentVector as a template to select indices of pbm_covars

The loss function takes in addition to ϕ, data that changes with minibatch

  • rng: random generator
  • xM: matrix of covariates, sites in columns
  • xP: drivers for the process model: Iterator of size n_site
  • y_o, y_unc: matrix of observations and uncertainties, sites in columns

HybridVariationalInference.insert_zeros - Method
insert_zeros(v, positions)

Return a new vector with zero(eltype(v)) inserted at each position in positions. Positions are applied in order against the growing vector (as if sequential inserts), so later indices are interpreted on the updated result. Only one output vector is allocated.
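
The sequential-insert semantics described above can be illustrated with a minimal reference sketch. The name `insert_zeros_sketch` is hypothetical, and the sketch relies on repeated `insert!` calls on a copy, so it only mirrors the described result and not the single-allocation behavior of the package function.

```julia
# Hypothetical reference sketch (not the package implementation):
# apply each position in order against the growing vector.
function insert_zeros_sketch(v::AbstractVector, positions)
    w = collect(v)                      # work on a copy, leave v untouched
    for p in positions
        insert!(w, p, zero(eltype(v)))  # later positions index the updated vector
    end
    return w
end

insert_zeros_sketch([1, 2, 3], [1, 3])  # == [0, 1, 0, 2, 3]
```

Here the second position, 3, refers to the vector after the first zero has been inserted, so the second zero lands between the original entries 1 and 2.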

HybridVariationalInference.neg_elbo_ζtf - Method

Compute the neg_elbo for each sampled parameter vector (last dimension of ζs).

  • Transform and compute log-jac
  • call forward model
  • compute log-density of joint density of predictions and unconstrained parameters, nLjoint and its components
    • nLy: The likelihood of the data, given the parameters
    • neg_log_prior: the prior of parameters at constrained scale
    • logjac: negative logarithm of the absolute value of the determinant of the Jacobian of the transformation θ=T(ζ).
  • loss_penalty: additional loss terms from penalty_computer
  • compute entropy of transformation

HybridVariationalInference.refit_clusters - Method
refit_clusters(rng, probo, solver, xM; scenario, n_cluster_initial, n_aggsplits, epochs)

Iteratively refit the model and split clusters of sites based on overdispersion tests.

When several sites are within one cluster, they are treated such that all their observations constrain the uncertainty of the mean estimate within that cluster. The fewer sites are within one cluster, the higher the uncertainty estimate.

This method implements a strategy that starts with few clusters and checks whether the distribution of predicted site values within a cluster is overdispersed relative to the uncertainty predicted for the cluster. If so, the cluster is split into smaller clusters and the model is refitted. Because refitting changes the uncertainty estimates, only the few (1/10th) clusters with the most sites are checked for overdispersion, and a refit takes place before the next clusters are checked.

Arguments

  • rng: random number generator for reproducibility
  • probo: the probabilistic model to fit
  • solver: optimization algorithm for fitting
  • xM: input data for the model
  • scenario: optional argument for different scenarios in the model
  • n_cluster_initial: number of clusters to start with
  • n_aggsplits: number of clusters to split before refitting
  • epochs: number of epochs for refitting after each series of splits

Returns

  • probo: the refitted probabilistic model
  • clusters: final cluster assignments for each site

HybridVariationalInference.reshape_penalty_matrix - Method
reshape_penalty_matrix(penalty::NamedTuple{KEYS}) where KEYS
reshape_penalty_matrix(penalty::ComponentVector{ET,KEYS}) where {ET,KEYS}

Reshape the output of the penalty computer to a ComponentMatrix. Assumes that all components in penalty have the same element type and length.

HybridVariationalInference.sample_ζresid_norm - Method

Extract relevant parameters from ζ and return nMC generated multivariate normal draws together with the vector of standard deviations, σ: (ζPresids, ζMsparfirstresids, σ). The output shape (nθ, nsite?, nMC) is tailored to adding ζMsparfirstresids to ML-model predictions of size (nθM, n_site).

Arguments

  • int_ϕq: Interprets a vector as a ComponentVector with components ρsP, ρsM, logσ2ζP, coeflogσ2_ζMs (intercept + slope).

HybridVariationalInference.take_n! - Method
take_n!(itr, n)

Peel off the first n elements of a drop-iterator itr and return them as a vector, while mutating itr so that it now starts after those n elements.

Examples

it = HybridVariationalInference.drop_iterate(1:5) # initialize the iterator

a1 = HybridVariationalInference.take_n!(it,3)
collect(a1) == [1,2,3]

a2 = HybridVariationalInference.take_n!(it,3)
collect(a2) == [4,5]  # only two elements left, so return those

a3 = HybridVariationalInference.take_n!(it,3)
collect(a3) == [] # no elements left, so return empty vector

HybridVariationalInference.transformU_blocks_cholesky1 - Method
transformU_block_cholesky1(v::AbstractVector, cor_ends)

Transform a parameterization, v, of a block-diagonal matrix of upper triangular matrices into a vector with one matrix for each block. cor_ends is an AbstractVector of Integers specifying the last column of each block. E.g. for a matrix with a 3x3, a 2x2, and another single-entry block, the blocks end at columns (3,5,6). It defaults to a single entire block.

A correlation parameterization can parameterize a block of a single parameter, or an empty parameter block. To indicate the empty block, provide cor_ends == [0].
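
As a small illustration of the cor_ends convention, the values (3,5,6) from the example are the cumulative block sizes. The variable names `block_sizes` and `block_starts` below are chosen for this sketch only.

```julia
# Block sizes of the example: a 3x3, a 2x2, and a single-entry block
block_sizes = [3, 2, 1]

# Last column of each block, i.e. the form expected for cor_ends
cor_ends = cumsum(block_sizes)                 # [3, 5, 6]

# First column of each block, derived from the ends
block_starts = [1; cor_ends[1:end-1] .+ 1]     # [1, 4, 6]
```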

HybridVariationalInference.transformU_cholesky1 - Method

Takes a vector of parameters for a UnitUpperTriangular matrix and transforms it to an UpperTriangular matrix that satisfies diag(U' * U) = 1.

This can be used to fit parameters that yield an upper Cholesky factor of a covariance matrix.

It uses the upper triangular matrix rather than the lower because it involves a sum across columns, whereas the alternative of a lower triangular uses sum across rows. Sum across columns is often faster, because entries of columns are contiguous.
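
One way to obtain the property diag(U' * U) = 1 is sketched below, assuming that the parameters fill the strict upper triangle column by column and that each column is then scaled to unit Euclidean norm. The function name `unit_column_upper_sketch` and the fill order are assumptions for illustration. The package implementation may differ.

```julia
using LinearAlgebra

# Hypothetical sketch (not the package implementation).
function unit_column_upper_sketch(v::AbstractVector{T}) where {T}
    # A strict upper triangle of an n x n matrix has n(n-1)/2 entries; solve for n.
    n = round(Int, (1 + sqrt(1 + 8 * length(v))) / 2)
    n * (n - 1) ÷ 2 == length(v) || throw(ArgumentError("length(v) must equal n(n-1)/2"))
    U = Matrix{T}(I, n, n)              # unit diagonal
    k = 1
    for j in 2:n, i in 1:(j - 1)        # fill strict upper triangle column by column
        U[i, j] = v[k]
        k += 1
    end
    # Scale each column to unit norm (a sum across columns), so that diag(U' * U) = 1.
    Us = U ./ sqrt.(sum(abs2, U; dims = 1))
    return UpperTriangular(Us)
end

U = unit_column_upper_sketch([0.5, -0.3, 0.8])
Σ = U' * U   # symmetric matrix with unit diagonal
```

Because each column of U has unit norm, Σ = U' * U has ones on its diagonal and can be read as a correlation matrix.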

An empty parameterization

HybridVariationalInference.transpose_mPMs_sitefirst - Method

Transforms each row of a matrix (nMC x nPar), in which the site parameters Ms inside nPar are stored in the form (npar x nsite), to Ms of the form (nsite x n_par), i.e. neighboring entries (inside a column) belong to the same parameter.

Having n_par as the last dimension helps with transforming parameters as a block.
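
The reordering can be sketched with plain arrays using reshape and permutedims. The sizes and variable names below are hypothetical, and the sketch covers only the site-parameter columns under the assumption of column-major flattening. It is not the package implementation.

```julia
n_par, n_site, n_MC = 2, 3, 4   # hypothetical sizes

# Each row holds one MC sample: an (n_par x n_site) matrix flattened column-major.
X = reshape(collect(1:(n_MC * n_par * n_site)), n_MC, n_par * n_site)

# View each row as (n_par x n_site), swap the two dimensions, and flatten again.
X3 = reshape(X, n_MC, n_par, n_site)
Xt = reshape(permutedims(X3, (1, 3, 2)), n_MC, n_site * n_par)

# The first sample before and after the reordering, shown as matrices
before = reshape(X[1, :], n_par, n_site)
after = reshape(Xt[1, :], n_site, n_par)   # equals the transpose of `before`
```

After the reordering, the first n_site entries of each row belong to the first parameter, the next n_site entries to the second parameter, and so on.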

HybridVariationalInference.vectuptotupvec - Method
vectuptotupvec(vectup)
vectuptotupvec_allowmissing(vectup)

Type-safe conversion from a Vector of Tuples to a Tuple of Vectors. The first variant does not allow for missing in vectup. The second variant allows for missing, but has an eltype of Union{Missing, ...} in all components of the returned Tuple, even when vectup contains no missing entries.

Arguments

  • vectup: A Vector of identical Tuples

Examples

vectup = [(1,1.01, "string 1"), (2,2.02, "string 2")] 
HybridVariationalInference.vectuptotupvec_allowmissing(vectup) == 
  ([1, 2], [1.01, 2.02], ["string 1", "string 2"])