Direct gradient calculation for torch models

Calculate gradients of model outputs with respect to inputs using native torch autograd. This function provides a lightweight alternative to run_grad and Gradient that works directly with any torch::nn_module objects without requiring model conversion.

torch_grad(
  model,
  data,
  output_idx = NULL,
  times_input = FALSE,
  dtype = "float",
  return_object = FALSE
)

Arguments

model: (nn_module)
A torch model. Must be an instance of nn_module.
data: (torch_tensor, array, or matrix)
Input data for which to calculate gradients. If not already a torch tensor, it will be converted automatically. Expected shape: (batch_size, ...) where ... represents the input dimensions.
output_idx: (integer)
Index or indices of output nodes for which to calculate gradients. If the model outputs a tensor of shape (batch_size, n_outputs), use indices 1 to n_outputs. Default: NULL (all outputs).
times_input: (logical(1))
If TRUE, multiplies the gradients by the input values (Gradient×Input method). Default: FALSE (Vanilla Gradient).
dtype: (character(1))
Data type for calculations. Either "float" for torch_float or "double" for torch_double. Default: "float".
return_object: (logical(1))
If TRUE, returns a InterpretingMethod object with methods like plot() and get_result(). If FALSE (default), returns a raw torch_tensor.

Value

If return_object = FALSE (default): A torch_tensor containing the gradients with shape (batch_size, ..., n_outputs). If return_object = TRUE: A InterpretingMethod object.

Details

This function computes the gradients of the outputs with respect to the input variables, i.e., for all input variable $i$ and output class $j$: $$d f(x)_j / d x_i$$

If times_input = TRUE, the gradients are multiplied by the respective input value (Gradient×Input): $$x_i * d f(x)_j / d x_i$$

While vanilla gradients emphasize prediction-sensitive features, Gradient×Input provides a decomposition of the output into feature-wise effects based on the first-order Taylor decomposition.

Performance

This function is typically faster and more memory-efficient than run_grad because it:

Avoids model conversion overhead
Uses native torch autograd directly

Comparison with run_grad

torch_grad is recommended when:

Working with non-sequential torch models
Working with torch models exclusively
Performance is critical

Use run_grad when:

Working with keras or neuralnet models
You need other attribution methods (LRP, DeepLift, etc.)

Examples

library(torch)

# Create a simple model
model <- nn_sequential(
  nn_linear(10, 50),
  nn_relu(),
  nn_linear(50, 3)
)

# Generate some data
data <- torch_randn(5, 10)

# Calculate vanilla gradients
grads <- torch_grad(model, data)

# Calculate Gradient×Input
grads_times_input <- torch_grad(model, data, times_input = TRUE)

# Calculate gradients for specific output
grads_class1 <- torch_grad(model, data, output_idx = 1)

# Get result as innsight object with plot() support
result <- torch_grad(model, data, return_object = TRUE)
plot(result)