Calculate Integrated Gradients using native torch autograd. This function
provides a lightweight alternative to run_intgrad and IntegratedGradient
that works directly with any torch::nn_module object.
Usage:

torch_intgrad(
  model,
  data,
  x_ref = NULL,
  output_idx = NULL,
  n = 50,
  times_input = TRUE,
  dtype = "float",
  return_object = FALSE
)

Arguments:

model (nn_module)
A torch model. Must be an instance of nn_module.
data (torch_tensor, array, or matrix)
Input data for which to calculate gradients.
x_ref (torch_tensor, array, or matrix)
Reference input (baseline). If NULL, uses zeros. Must have shape
(1, ...) where ... matches input dimensions. Default: NULL.
output_idx (integer)
Index or indices of output nodes. Default: NULL (all outputs).
n (integer(1))
Number of steps for approximating the integral. Default: 50.
times_input (logical(1))
If TRUE, multiplies integrated gradients by (data - x_ref).
This is the standard Integrated Gradients formulation. Default: TRUE.
dtype (character(1))
Data type: "float" or "double". Default: "float".
return_object (logical(1))
If TRUE, returns an InterpretingMethod object with methods like plot()
and get_result(). If FALSE (default), returns a raw torch_tensor.
Value:

If return_object = FALSE (default): a torch_tensor containing the
integrated gradients with shape (batch_size, ..., n_outputs).

If return_object = TRUE: an InterpretingMethod object.
Details:

Integrated Gradients calculates feature importance by integrating gradients along the path from a baseline \(x'\) to the input \(x\):
$$(x - x') \times \int_{\alpha=0}^{1} \frac{\partial f(x' + \alpha (x - x'))}{\partial x} d\alpha$$
The integral is approximated using n interpolation steps.
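The Riemann-sum approximation of this integral can be sketched in plain R for a function whose gradient is known analytically, without torch. The names ig_riemann and grad_f below are hypothetical illustration names, not part of this package; the sketch uses a midpoint rule over n steps, which matches the form of the integral above.

```r
# Sketch: approximate the IG integral with a midpoint Riemann sum.
# grad_f must return the gradient of f at a point (here f(x) = sum(x^2),
# whose gradient is 2 * x, so IG has the closed form x^2 - x_ref^2).
ig_riemann <- function(x, x_ref, grad_f, n = 50) {
  alphas <- (seq_len(n) - 0.5) / n           # midpoints of n subintervals
  # Average the gradient along the straight path from x_ref to x
  avg_grad <- Reduce(`+`, lapply(alphas, function(a) {
    grad_f(x_ref + a * (x - x_ref))
  })) / n
  (x - x_ref) * avg_grad                     # times-input scaling
}

x <- c(1, 2, 3)
x_ref <- c(0, 0, 0)
ig <- ig_riemann(x, x_ref, grad_f = function(z) 2 * z, n = 50)
print(ig)                    # approx. c(1, 4, 9), i.e. x^2 - x_ref^2
# Completeness axiom: attributions sum to f(x) - f(x_ref)
print(sum(ig) - (sum(x^2) - sum(x_ref^2)))
```

Because the integrand is linear in alpha for this f, the midpoint rule is exact here; for a nonlinear model, larger n reduces the approximation error.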
References:

Sundararajan, M. et al. (2017). Axiomatic Attribution for Deep Networks. ICML 2017, PMLR 70, pp. 3319-3328.
See also:

run_intgrad, IntegratedGradient
Other direct torch methods:
torch_expgrad(),
torch_grad(),
torch_smoothgrad()
Examples:

library(torch)
model <- nn_sequential(nn_linear(10, 3))
data <- torch_randn(5, 10)
# Use zero baseline (default)
int_grads <- torch_intgrad(model, data)
# Use custom baseline
baseline <- torch_zeros(1, 10)
int_grads <- torch_intgrad(model, data, x_ref = baseline)
# More integration steps for higher accuracy
int_grads <- torch_intgrad(model, data, n = 100)
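The remaining documented arguments can be exercised the same way; this is a sketch assuming the plot() and get_result() methods are accessed with $ on the returned InterpretingMethod object:

```r
# Attribute only the first output node
int_grads <- torch_intgrad(model, data, output_idx = 1)

# Return an InterpretingMethod object instead of a raw tensor
res <- torch_intgrad(model, data, return_object = TRUE)
# res$plot() and res$get_result() per the documented methods (assumed
# R6-style access; see the InterpretingMethod documentation)
```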