Estimates a covariate-varying network model (CVN), i.e., \(m\) Gaussian graphical models that change with (multiple) external covariate(s). The smoothing between the graphs is specified by the \((m \times m)\)-dimensional weight matrix \(W\). The function returns the estimated precision matrices for each graph.
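As a rough illustration of the expected input shapes, the sketch below builds a list of \(m\) data matrices and a symmetric \((m \times m)\) weight matrix in base R. The sizes, the random data and the weight values are made up for illustration; they are not taken from the package.

```r
set.seed(1)

m <- 3                  # number of graphs (hypothetical)
p <- 4                  # number of variables, shared by all graphs
n_obs <- c(50, 80, 65)  # observations may differ between graphs

# One (n_i x p) matrix per graph
data <- lapply(n_obs, function(n) matrix(rnorm(n * p), nrow = n, ncol = p))

# Symmetric (m x m) weight matrix with a zero diagonal (made-up weights)
W <- matrix(0, m, m)
W[upper.tri(W)] <- c(1, 0.5, 1)   # fills (1,2), (1,3), (2,3)
W <- W + t(W)

# Basic sanity checks on the inputs
stopifnot(length(data) == m,
          all(vapply(data, ncol, integer(1)) == p),
          isSymmetric(W))
```

Objects shaped like these could then be passed as data and W.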
Usage
CVN(
data,
W,
lambda1 = 1:2,
lambda2 = 1:2,
gamma1 = NULL,
gamma2 = NULL,
rho = 1,
eps = 1e-04,
maxiter = 100,
truncate = 1e-05,
rho_genlasso = 1,
eps_genlasso = 1e-10,
maxiter_genlasso = 100,
truncate_genlasso = 1e-04,
n_cores = min(length(lambda1) * length(lambda2), detectCores() - 1),
normalized = FALSE,
warmstart = TRUE,
minimal = FALSE,
gamma_ebic = 0.5,
verbose = TRUE
)
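The default for n_cores depends on how many \((\lambda_1, \lambda_2)\) pairs are fitted. A minimal sketch of that computation follows; the NA check and the lower bound of one core are additions made here for robustness and are not part of the documented default.

```r
library(parallel)  # provides detectCores()

lambda1 <- 1:2
lambda2 <- 1:2

# Every (lambda1, lambda2) combination is one model to fit
pairs <- expand.grid(lambda1 = lambda1, lambda2 = lambda2)
n_pairs <- nrow(pairs)

# Documented default: min(number of pairs, detectCores() - 1).
# Assumption of this sketch: fall back to 1 core if detectCores()
# returns NA or if only a single core is available.
available <- detectCores()
n_cores <- if (is.na(available)) 1 else max(1, min(n_pairs, available - 1))
```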
Arguments
- data
A list of matrices, one per graph. All matrices must have the same number of columns; the number of observations (rows) may differ.
- W
The \((m \times m)\)-dimensional symmetric weight matrix \(W\)
- lambda1
Vector of \(\lambda_1\) values, the LASSO penalty terms (Default: 1:2)
- lambda2
Vector of \(\lambda_2\) values, the global smoothing parameters (Default: 1:2)
- gamma1
A vector of \(\gamma_1\) values, the LASSO penalty terms, where \(\gamma_1 = \frac{2 \lambda_1}{m p (p - 1)}\). If gamma1 is set, the value of lambda1 is ignored (Default: NULL)
- gamma2
A vector of \(\gamma_2\) values, the global smoothing parameters, where \(\gamma_2 = \frac{4 \lambda_2}{m(m-1)p(p-1)}\). If gamma2 is set, the value of lambda2 is ignored (Default: NULL)
- rho
The penalty parameter \(\rho\) for the global ADMM algorithm (Default: 1)
- eps
If the relative difference between two update steps is smaller than \(\epsilon\), the algorithm stops (Default: 1e-4)
- maxiter
Maximum number of iterations of the global ADMM algorithm (Default: 100)
- truncate
All values of the final \(\hat{\Theta}_i\)'s below truncate will be set to 0 (Default: 1e-5)
- rho_genlasso
The penalty parameter \(\rho\) for the ADMM algorithm used to solve the generalized LASSO (Default: 1)
- eps_genlasso
If the relative difference between two update steps of the generalized LASSO is smaller than \(\epsilon\), that algorithm stops (Default: 1e-10)
- maxiter_genlasso
Maximum number of iterations for solving the generalized LASSO problem (Default: 100)
- truncate_genlasso
All values of the final \(\hat{\beta}\) below truncate_genlasso will be set to 0 (Default: 1e-4)
- n_cores
Number of cores used (Default: the number of available cores minus 1, or the total number of penalty term pairs if that is smaller)
- normalized
If TRUE, the data is normalized. Otherwise the data is only centered (Default: FALSE)
- warmstart
If TRUE, use the glasso package for estimating the individual graphs first (Default: TRUE)
- minimal
If TRUE, the returned cvn is minimal in terms of memory, i.e., Theta, data and Sigma are not returned (Default: FALSE)
- gamma_ebic
Gamma value for the eBIC (Default: 0.5)
- verbose
If TRUE, progress information is printed (Default: TRUE)
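The relation between the \(\lambda\) and \(\gamma\) parameterizations can be checked by hand. The sketch below uses \(m = 9\), \(p = 10\) and \(\lambda_1 = \lambda_2 = 1\), the setting of the Examples section, with \(p - 1\) in the denominator of \(\gamma_1\); this reproduces the gamma1 and gamma2 values printed in the Examples output. The variable names are chosen here for illustration.

```r
m <- 9    # number of graphs
p <- 10   # number of variables

lambda1 <- 1
lambda2 <- 1

# lambda -> gamma
gamma1 <- 2 * lambda1 / (m * p * (p - 1))
gamma2 <- 4 * lambda2 / (m * (m - 1) * p * (p - 1))

print(gamma1)  # 0.002469136
print(gamma2)  # 0.000617284

# gamma -> lambda (inverse mapping)
lambda1_back <- gamma1 * m * p * (p - 1) / 2
lambda2_back <- gamma2 * m * (m - 1) * p * (p - 1) / 4
```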
Value
A CVN object containing the estimates for all the graphs for each different value of \((\lambda_1, \lambda_2)\). General results for the different values of \((\lambda_1, \lambda_2)\) can be found in the data frame results. It consists of multiple columns, namely:
id
The id. This corresponds to the indices of the lists
lambda1
\(\lambda_1\) value
lambda2
\(\lambda_2\) value
gamma1
\(\gamma_1\) value
gamma2
\(\gamma_2\) value
converged
whether algorithm converged or not
value
value of the negative log-likelihood function
n_iterations
number of iterations of the ADMM
aic
Akaike information criterion
bic
Bayesian information criterion
ebic
Extended Bayesian information criterion
edges_median
Median number of edges across the m estimated graphs
edges_iqr
Interquartile range of edges across the m estimated graphs
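A data frame with these columns lends itself to simple model selection. The following sketch uses a mock results table with made-up values (only a subset of the columns) and picks the converged row with the smallest BIC; in practice the table would come from the fitted object.

```r
# Mock results table; all numbers are invented for illustration
results <- data.frame(
  id        = 1:4,
  lambda1   = c(1, 1, 2, 2),
  lambda2   = c(1, 2, 1, 2),
  converged = c(TRUE, TRUE, TRUE, FALSE),
  aic       = c(14315, 14290, 14400, 14350),
  bic       = c(15748, 15600, 15500, 15450),
  ebic      = c(18281, 18000, 17800, 17700)
)

# Keep only fits that converged, then take the row with the smallest BIC
ok <- results[results$converged, ]
best_bic <- ok[which.min(ok$bic), ]

best_bic$id  # 3
```

Because id corresponds to the indices of the lists in the returned object, best_bic$id could then be used to look up the matching estimates.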
The estimates of the precision matrices and the corresponding adjacency matrices for the different values of \((\lambda_1, \lambda_2)\) can be found in:
Theta
A list with the estimated precision matrices \(\{ \hat{\Theta}_i(\lambda_1, \lambda_2) \}_{i = 1}^m\) (only if minimal = FALSE)
adj_matrices
A list with the estimated adjacency matrices corresponding to the estimated precision matrices in Theta. The entries are 1 if there is an edge, 0 otherwise. The matrices are stored as sparse matrices using the Matrix package
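To make the link between a precision matrix and its adjacency matrix concrete, here is a base R sketch on a small made-up matrix. It assumes that truncation is applied to absolute values and that the diagonal carries no edges; neither detail is stated above, and the package itself stores the result as a sparse matrix rather than a dense one.

```r
# Made-up (3 x 3) precision matrix with one negligible off-diagonal entry
Theta <- matrix(c( 2.0,  -0.6,  1e-7,
                  -0.6,   1.5,  0.3,
                   1e-7,  0.3,  1.8), nrow = 3, byrow = TRUE)

truncate <- 1e-5

# Assumption: entries are truncated by absolute value
Theta[abs(Theta) < truncate] <- 0

# 1 where an off-diagonal entry is non-zero, 0 otherwise
adj <- (Theta != 0) * 1
diag(adj) <- 0   # assumption: no self-loops

# Each undirected edge appears twice in adj, so count the upper triangle
n_edges <- sum(adj[upper.tri(adj)])
n_edges  # 2
```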
In addition, the input given to the CVN function is stored in the object as well:
Sigma
Empirical covariance matrices \(\{\hat{\Sigma}_i\}_{i = 1}^m\) (only if minimal = FALSE)
m
Number of graphs
p
Number of variables
n_obs
Vector of length \(m\) with number of observations for each graph
data
The input data after normalizing or centering (only if minimal = FALSE)
W
The \((m \times m)\)-dimensional weight matrix \(W\)
maxiter
Maximum number of iterations for the ADMM
rho
The ADMM penalty parameter \(\rho\)
eps
The stopping criterion \(\epsilon\)
truncate
Truncation value for \(\{ \hat{\Theta}_i \}_{i = 1}^m\)
maxiter_genlasso
Maximum number of iterations for the generalized LASSO
rho_genlasso
The penalty parameter \(\rho\) for the generalized LASSO
eps_genlasso
The stopping criterion \(\epsilon\) for the generalized LASSO
truncate_genlasso
Truncation value for \(\beta\) of the generalized LASSO
n_lambda_values
Total number of \((\lambda_1, \lambda_2)\) value combinations
normalized
If TRUE, data was normalized. Otherwise data was only centered
warmstart
If TRUE, warmstart was used
minimal
If TRUE, data, Theta and Sigma are not added
hits_border_aic
If TRUE, the optimal model based on the AIC hits the border of \((\lambda_1, \lambda_2)\)
hits_border_bic
If TRUE, the optimal model based on the BIC hits the border of \((\lambda_1, \lambda_2)\)
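One plausible reading of "hits the border" is that the selected \((\lambda_1, \lambda_2)\) pair lies at the smallest or largest value of either supplied grid, which would suggest widening the grid. The sketch below implements that reading on made-up BIC values; the helper hits_border() is hypothetical and not a function of the package.

```r
lambda1 <- c(0.5, 1, 2)
lambda2 <- c(0.5, 1, 2)

# All grid combinations; lambda1 varies fastest
results <- expand.grid(lambda1 = lambda1, lambda2 = lambda2)
results$bic <- c(120, 110, 115, 118, 105, 112, 125, 119, 121)  # made up

# Hypothetical helper: TRUE if either coordinate sits on the grid boundary
hits_border <- function(best, lambda1, lambda2) {
  best$lambda1 %in% range(lambda1) || best$lambda2 %in% range(lambda2)
}

best <- results[which.min(results$bic), ]   # row 5: lambda1 = 1, lambda2 = 1
interior_case <- hits_border(best, lambda1, lambda2)          # FALSE
border_case   <- hits_border(results[1, ], lambda1, lambda2)  # TRUE
```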
Reusing Estimates
When estimating the graphs for several values of \(\lambda_1\) and \(\lambda_2\), the graphs already estimated for the closest \((\lambda_1, \lambda_2)\) pair are reused, if available.
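The idea of picking the closest pair that has already been fitted can be sketched as follows. Euclidean distance on the \((\lambda_1, \lambda_2)\) plane is an assumption made here for illustration; the distance actually used by the package is not stated above.

```r
# Pairs that have already been fitted (made-up values)
fitted <- data.frame(lambda1 = c(1, 1, 2),
                     lambda2 = c(1, 2, 1))

# Pair to be fitted next
target <- c(lambda1 = 2, lambda2 = 1.5)

# Assumption: closeness is measured by Euclidean distance
d <- sqrt((fitted$lambda1 - target[["lambda1"]])^2 +
          (fitted$lambda2 - target[["lambda2"]])^2)

closest <- fitted[which.min(d), ]
closest  # row 3: lambda1 = 2, lambda2 = 1
```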
Examples
data(grid)
# Choice of the weight matrix W. Each of the 2 covariates has 3 categories
# (uniform random)
W <- create_weight_matrix("uniform-random", k = 3, l = 3)
# Penalty parameters
lambda1 <- 1 # can also be a vector, e.g. lambda1 <- 1:2
lambda2 <- 1
(fit <- CVN(data = grid,
W = W,
lambda1 = lambda1, lambda2 = lambda2,
n_cores = 1,
eps = 1e-2, maxiter = 200, # fast but imprecise
verbose = TRUE))
#> Estimating a CVN with 9 graphs...
#>
#> Number of cores: 1
#> Uses a warmstart...
#>
#> -------------------------
#> iteration 1 | 2.180956
#> iteration 2 | 0.115992
#> iteration 3 | 0.085703
#> iteration 4 | 0.030387
#> iteration 5 | 0.022670
#> iteration 6 | 0.017581
#> iteration 7 | 0.016135
#> iteration 8 | 0.014122
#> iteration 9 | 0.011050
#> iteration 10 | 0.010618
#> -------------------------
#> iteration 11 | 0.009648
#> Covariate-varying Network (CVN)
#>
#> ✓ all converged
#>
#> Number of graphs (m) : 9
#> Number of variables (p) : 10
#> Number of lambda pairs : 1
#>
#> Weight matrix (W):
#> 9 x 9 sparse Matrix of class "dsCMatrix"
#>
#> [1,] . 0.6012593 0.3094466 0.3387542 0.2975096 0.2986163 0.3823848
#> [2,] 0.6012593 . 0.5036383 0.5257210 0.4768016 0.5083998 0.7264707
#> [3,] 0.3094466 0.5036383 . 0.2970355 0.2157332 0.2794692 0.3662958
#> [4,] 0.3387542 0.5257210 0.2970355 . 0.2893602 0.2810684 0.3744014
#> [5,] 0.2975096 0.4768016 0.2157332 0.2893602 . 0.2500744 0.3623793
#> [6,] 0.2986163 0.5083998 0.2794692 0.2810684 0.2500744 . 0.3741879
#> [7,] 0.3823848 0.7264707 0.3662958 0.3744014 0.3623793 0.3741879 .
#> [8,] 0.3184479 0.5514783 0.2465658 0.2723133 0.3065248 0.2787839 0.4334903
#> [9,] 0.2250348 0.4363431 0.3054915 0.2642960 0.1564069 0.2694153 0.3516440
#>
#> [1,] 0.3184479 0.2250348
#> [2,] 0.5514783 0.4363431
#> [3,] 0.2465658 0.3054915
#> [4,] 0.2723133 0.2642960
#> [5,] 0.3065248 0.1564069
#> [6,] 0.2787839 0.2694153
#> [7,] 0.4334903 0.3516440
#> [8,] . 0.2091313
#> [9,] 0.2091313 .
#>
#> id lambda1 lambda2 gamma1 gamma2 converged value n_iterations
#> 1 1 1 1 0.002469136 0.000617284 TRUE 0.009647579 12
#> aic bic ebic edges_median edges_iqr
#> 1 14315.63 15748.48 18281.32 31 1