Calculate Expected Gradients (GradSHAP) using native torch autograd.
This function provides a lightweight alternative to run_expgrad
and ExpectedGradient that is more efficient for torch models: it
avoids the model-conversion overhead by using native torch autograd directly.
It can therefore be used with any torch::nn_module, without restrictions on architecture or layers.
torch_expgrad(
model,
data,
data_ref = NULL,
output_idx = NULL,
n = 50,
dtype = "float",
return_object = FALSE
)

model (nn_module)
A torch model.

data (torch_tensor, array, or matrix)
Input data.

data_ref (torch_tensor, array, or matrix)
Reference dataset for estimating the conditional expectation.
If NULL, a zero baseline is used. Default: NULL.

output_idx (integer)
Index or indices of the output nodes to explain. Default: NULL (all outputs).

n (integer(1))
Number of reference samples and integration steps. Default: 50.

dtype (character(1))
Data type of the calculation: "float" or "double". Default: "float".

return_object (logical(1))
If TRUE, returns an InterpretingMethod object with
methods like plot() and get_result(). If FALSE (default), returns
a raw torch_tensor.
If return_object = FALSE (default): a torch_tensor
containing the expected gradients, with shape (batch_size, ..., n_outputs).
If return_object = TRUE: an InterpretingMethod object.
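For instance, a model with 3 output nodes applied to inputs of shape (5, 10) yields a raw result of shape (5, 10, 3). A minimal sketch, assuming the package providing torch_expgrad is attached:

library(torch)
model <- nn_sequential(nn_linear(10, 3))
res <- torch_expgrad(model, torch_randn(5, 10)) # data_ref = NULL: zero baseline
res$shape # (5, 10, 3), i.e. (batch_size, input features, n_outputs)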
Expected Gradients extends Integrated Gradients by averaging over references x' drawn from a baseline distribution X' and interpolation points alpha drawn uniformly from (0, 1):
$$E_{x'\sim X', \alpha \sim U(0,1)}\left[(x - x') \times \frac{\partial f(x' + \alpha (x - x'))}{\partial x}\right]$$
The resulting attributions approximate Shapley values.
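For intuition, this estimator can be written out directly with torch autograd. The following minimal sketch computes attributions for a single input x of shape (1, p) with respect to one output node; expgrad_sketch and its arguments are hypothetical helpers for illustration, not part of this package:

library(torch)

expgrad_sketch <- function(model, x, data_ref, n = 50, out_idx = 1) {
  idx <- sample(data_ref$size(1), n, replace = TRUE) # draw n references x' ~ X'
  refs <- data_ref[idx, ]
  alpha <- torch_rand(n, 1)                          # alpha ~ U(0, 1)
  interp <- refs + alpha * (x - refs)                # x' + alpha * (x - x')
  interp$requires_grad_(TRUE)
  out <- model(interp)[, out_idx]$sum()              # selected output node
  grads <- autograd_grad(out, list(interp))[[1]]     # df/dx at interpolation points
  ((x - refs) * grads)$mean(dim = 1)                 # Monte Carlo average
}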
G. Erion et al. (2021) Improving performance of deep learning models with axiomatic attribution priors and expected gradients. Nature Machine Intelligence 3, pp. 620-631.
Other direct torch methods:
torch_grad(),
torch_intgrad(),
torch_smoothgrad()
library(torch)
model <- nn_sequential(nn_linear(10, 3))
data <- torch_randn(5, 10)
references <- torch_randn(100, 10) # Reference distribution
# Calculate Expected Gradients
exp_grads <- torch_expgrad(model, data, data_ref = references)
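# With return_object = TRUE, the same call yields an InterpretingMethod
# object (see return_object above); a sketch of that interface:
exp_obj <- torch_expgrad(model, data, data_ref = references,
                         return_object = TRUE)
result <- get_result(exp_obj)
plot(exp_obj)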