learner {learner}    R Documentation

Latent space-based transfer learning

Description

This function applies the LatEnt spAce-based tRaNsfer lEaRning (LEARNER) method (McGrath et al. 2024), which leverages data from a source population to improve estimation of a low-rank matrix in an underrepresented target population.

Usage

learner(Y_source, Y_target, r, lambda_1, lambda_2, step_size, control = list())

Arguments

Y_source

matrix containing the source population data

Y_target

matrix containing the target population data

r

(optional) integer specifying the rank of the knowledge graphs. By default, ScreeNOT (Donoho et al. 2023) is applied to the source population knowledge graph to select the rank.

lambda_1

numeric scalar specifying the value of \lambda_1 (see Details)

lambda_2

numeric scalar specifying the value of \lambda_2 (see Details)

step_size

numeric scalar specifying the step size for the gradient descent steps in the numerical optimization algorithm (see Details)

control

a list of parameters for controlling the stopping criteria for the numerical optimization algorithm. The list may include the following components:

max_iter: integer specifying the maximum number of iterations.
threshold: numeric scalar specifying a convergence threshold. The algorithm converges when |\epsilon_t - \epsilon_{t-1}| < threshold, where \epsilon_t denotes the value of the objective function at iteration t.
max_value: numeric scalar specifying the maximum value of the objective function allowed before the algorithm is terminated. Specifically, the algorithm terminates if the value of the objective function exceeds max_value \times \epsilon_0, where \epsilon_0 denotes the value of the objective function at the initial point. This avoids unnecessary computation after the optimization algorithm diverges.
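
As an illustration, a control list could be assembled as in the sketch below. The component names are those listed above; the numeric values are arbitrary placeholders chosen for this sketch, not defaults or recommended settings. The call to learner is skipped when the package is not installed.

```r
# Stopping criteria for the optimizer; all values are illustrative assumptions.
ctrl <- list(
  max_iter  = 200,   # stop after at most 200 iterations
  threshold = 1e-4,  # stop when |eps_t - eps_{t-1}| < 1e-4
  max_value = 10     # stop if the objective exceeds 10 * eps_0
)

if (requireNamespace("learner", quietly = TRUE)) {
  dat <- learner::dat_highsim
  res <- learner::learner(Y_source = dat$Y_source,
                          Y_target = dat$Y_target,
                          lambda_1 = 1, lambda_2 = 1,
                          step_size = 0.003,
                          control = ctrl)
  # Integer code reporting which stopping rule fired (see Value)
  print(res$convergence_criterion)
}
```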

Details

Data and notation:

The data consist of a matrix from the target population, Y_0 \in \mathbb{R}^{p \times q}, and a matrix from the source population, Y_1 \in \mathbb{R}^{p \times q}. Let \hat{U}_{k} \hat{\Lambda}_{k} \hat{V}_{k}^{\top} denote the rank-r truncated singular value decomposition (SVD) of Y_k, k = 0, 1, so that \hat{U}_{k} \in \mathbb{R}^{p \times r} and \hat{V}_{k} \in \mathbb{R}^{q \times r}.

For k = 0, 1, one can view Y_k as a noisy version of \Theta_k, referred to as the knowledge graph. The target of inference is the target population knowledge graph, \Theta_0.
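
The notation above can be made concrete in base R. The sketch below simulates a rank-r matrix Theta, adds Gaussian noise to obtain Y, and forms the rank-r truncated SVD of Y. The dimensions, the noise level, and the helper name truncated_svd are assumptions made for illustration; none of them come from the package.

```r
set.seed(1)
p <- 30; q <- 20; r <- 3

# Hypothetical low-rank "knowledge graph" and a noisy observation of it
Theta <- matrix(rnorm(p * r), p, r) %*% t(matrix(rnorm(q * r), q, r))
Y     <- Theta + matrix(rnorm(p * q, sd = 0.1), p, q)

# Rank-r truncated SVD: keep only the r leading singular triplets
truncated_svd <- function(Y, r) {
  s <- svd(Y, nu = r, nv = r)
  list(U = s$u, d = s$d[seq_len(r)], V = s$v)
}

tsvd  <- truncated_svd(Y, r)

# Rank-r reconstruction U diag(d) V^T
Y_hat <- tsvd$U %*% diag(tsvd$d, nrow = r) %*% t(tsvd$V)

dim(tsvd$U)  # p rows, r columns
dim(tsvd$V)  # q rows, r columns
```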

Estimation:

This method estimates \Theta_0 by \tilde{U}\tilde{V}^{\top}, where (\tilde{U}, \tilde{V}) is the solution to the following optimization problem

\mathrm{arg\,min}_{U \in \mathbb{R}^{p \times r}, V \in \mathbb{R}^{q \times r}} \big\{ \| U V^{\top} - Y_0 \|_F^2 + \lambda_1\| \mathcal{P}_{\perp}(\hat{U}_{1})U \|_F^2 + \lambda_1\| \mathcal{P}_{\perp}(\hat{V}_{1})V \|_F^2 + \lambda_2 \| U^{\top} U - V^{\top} V \|_F^2 \big\}

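where \mathcal{P}_{\perp}(\hat{U}_{1}) = I - \hat{U}_{1}\hat{U}_{1}^{\top} and \mathcal{P}_{\perp}(\hat{V}_{1}) = I - \hat{V}_{1}\hat{V}_{1}^{\top} are the projections onto the orthogonal complements of the column spaces of \hat{U}_{1} and \hat{V}_{1}, respectively.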
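
A quick numerical check of the penalty term, as a base R sketch. It takes the projector to be the p \times p matrix I - \hat{U}_{1}\hat{U}_{1}^{\top} (it must be p \times p to multiply U) and compares the penalty for a U whose columns lie in the column space of \hat{U}_{1} with the penalty for a generic U. The helper name proj_perp and the toy dimensions are assumptions for illustration.

```r
set.seed(2)
p <- 8; r <- 2

# Orthonormal basis standing in for U_hat_1:
# the r leading left singular vectors of a random matrix
U1 <- svd(matrix(rnorm(p * 5), p, 5), nu = r, nv = 0)$u

# Projector onto the orthogonal complement of the column space of A
# (A is assumed to have orthonormal columns)
proj_perp <- function(A) diag(nrow(A)) - A %*% t(A)
P <- proj_perp(U1)

# U with columns inside span(U1): the penalty vanishes up to rounding error
U_in  <- U1 %*% matrix(rnorm(r * r), r, r)
# Generic U: the penalty is strictly positive
U_out <- matrix(rnorm(p * r), p, r)

pen_in  <- sum((P %*% U_in)^2)   # squared Frobenius norm of P U_in
pen_out <- sum((P %*% U_out)^2)  # squared Frobenius norm of P U_out
```

The penalty therefore shrinks the column space of U toward that of \hat{U}_{1} without constraining how U is expressed within that space.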

This function uses an alternating minimization strategy to solve the optimization problem. That is, this approach updates U by minimizing the objective function (via a gradient descent step) treating V as fixed. Then, V is updated treating U as fixed. These updates of U and V are repeated until convergence.
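
The alternating updates can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the gradients are derived by hand from the objective above (with the projectors taken as I - \hat{U}_{1}\hat{U}_{1}^{\top} and I - \hat{V}_{1}\hat{V}_{1}^{\top}), the initial point is assumed to be the balanced factors of the rank-r source SVD, and only the max_iter and threshold stopping rules are mimicked. All function names and the toy data are hypothetical.

```r
proj_perp <- function(A) diag(nrow(A)) - A %*% t(A)

# Objective from the Details section
objective <- function(U, V, Y0, PU, PV, lambda_1, lambda_2) {
  sum((U %*% t(V) - Y0)^2) +
    lambda_1 * sum((PU %*% U)^2) +
    lambda_1 * sum((PV %*% V)^2) +
    lambda_2 * sum((crossprod(U) - crossprod(V))^2)
}

# Hand-derived gradients of the objective with respect to U and V
grad_U <- function(U, V, Y0, PU, lambda_1, lambda_2) {
  2 * (U %*% t(V) - Y0) %*% V + 2 * lambda_1 * PU %*% U +
    4 * lambda_2 * U %*% (crossprod(U) - crossprod(V))
}
grad_V <- function(U, V, Y0, PV, lambda_1, lambda_2) {
  2 * t(U %*% t(V) - Y0) %*% U + 2 * lambda_1 * PV %*% V -
    4 * lambda_2 * V %*% (crossprod(U) - crossprod(V))
}

alt_min_sketch <- function(Y0, Y1, r, lambda_1, lambda_2, step_size,
                           max_iter = 100, threshold = 1e-6) {
  s1 <- svd(Y1, nu = r, nv = r)
  PU <- proj_perp(s1$u)
  PV <- proj_perp(s1$v)
  # Assumed initial point: balanced factors of the rank-r source SVD
  D_half <- diag(sqrt(s1$d[seq_len(r)]), nrow = r)
  U <- s1$u %*% D_half
  V <- s1$v %*% D_half
  eps <- objective(U, V, Y0, PU, PV, lambda_1, lambda_2)
  for (iter in seq_len(max_iter)) {
    # Gradient step in U with V held fixed, then in V with the new U held fixed
    U <- U - step_size * grad_U(U, V, Y0, PU, lambda_1, lambda_2)
    V <- V - step_size * grad_V(U, V, Y0, PV, lambda_1, lambda_2)
    eps <- c(eps, objective(U, V, Y0, PU, PV, lambda_1, lambda_2))
    if (abs(eps[iter + 1] - eps[iter]) < threshold) break
  }
  list(estimate = U %*% t(V), objective_values = eps)
}

# Toy data: source and target share the same low-rank signal
set.seed(3)
p <- 20; q <- 15; r <- 2
Theta <- (matrix(rnorm(p * r), p, r) / sqrt(p)) %*%
  t(matrix(rnorm(q * r), q, r) / sqrt(q))
Y1 <- Theta + matrix(rnorm(p * q, sd = 0.02), p, q)
Y0 <- Theta + matrix(rnorm(p * q, sd = 0.02), p, q)

fit <- alt_min_sketch(Y0, Y1, r = r, lambda_1 = 1, lambda_2 = 1,
                      step_size = 0.01)
```

Because the step size is fixed, a value that is too large for the scale of the data can make the objective grow rather than shrink, which is the situation the max_value control is meant to catch.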

Value

A list with the following elements:

learner_estimate

matrix containing the LEARNER estimate of the target population knowledge graph

objective_values

numeric vector containing the values of the objective function at each iteration

convergence_criterion

integer specifying the criterion that terminated the numerical optimization algorithm. A value of 1 indicates that the convergence threshold was satisfied; a value of 2 indicates that the maximum number of iterations was reached; a value of 3 indicates that the maximum value of the objective function was exceeded.
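
When checking results programmatically, the integer code can be mapped to a readable message. The helper below is hypothetical (it is not part of the package) and simply encodes the three cases described above.

```r
# Hypothetical helper: translate convergence_criterion into a message
describe_convergence <- function(code) {
  switch(as.character(code),
         "1" = "converged: change in objective fell below threshold",
         "2" = "stopped: maximum number of iterations reached",
         "3" = "stopped: objective exceeded max_value times its initial value",
         stop("unknown convergence_criterion: ", code))
}

describe_convergence(2)
```

A code of 2 or 3 suggests revisiting step_size or the control settings before relying on the estimate.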

r

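integer giving the rank that was used, i.e., the supplied value of r or, if r was not supplied, the rank selected by ScreeNOT.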

References

McGrath, S., Zhu, C., Guo, M. and Duan, R. (2024). LEARNER: A transfer learning method for low-rank matrix estimation. arXiv preprint arXiv:2412.20605.

Donoho, D., Gavish, M. and Romanov, E. (2023). ScreeNOT: Exact MSE-optimal singular value thresholding in correlated noise. The Annals of Statistics, 51(1), pp.122-148.

Examples

res <- learner(Y_source = dat_highsim$Y_source,
               Y_target = dat_highsim$Y_target,
               lambda_1 = 1, lambda_2 = 1,
               step_size = 0.003)


[Package learner version 1.0.0 Index]