This class analyzes a passed neural network and stores its internal structure and individual layers by converting the entire network into an nn_module. The converted model is the basis for the interpretation methods in this package, which give a better understanding of the whole model or of individual predictions. You can use models from the following libraries:

  • torch (nn_sequential)

  • keras (keras_model, keras_model_sequential)

  • neuralnet

Furthermore, a model can be passed as a list (see vignette("detailed_overview", package = "innsight") or the website).

The R6 class can also be initialized using the convert function as a helper function so that no prior knowledge of R6 classes is required.

Details


To understand and analyze the prediction of a neural network, the preactivations and other layer-level information, which are not stored in an ordinary forward pass, are often required. For this reason, a given neural network is converted into a torch-based neural network that provides all the information needed for an interpretation. The converted torch model is stored in the field model and is an instance of ConvertedModel. Before the torch model is created, all relevant details of the passed model are extracted into a named list. This list can be saved in complete form in the field model_as_list by setting the argument save_model_as_list, but this may consume a lot of memory for large networks and is not done by default. This named list can in turn be passed as a model to the class Converter, which is described in more detail in the section 'Implemented libraries'.
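The round trip described above can be sketched as follows. This is a minimal sketch, not a definitive recipe: the small torch model is a hypothetical example, and it assumes that the list stored in model_as_list can be passed back to convert() unchanged, as the text states.

```r
library(torch)
library(innsight)

# Hypothetical small torch model; 'input_dim' is required for nn_sequential
model <- nn_sequential(
  nn_linear(5, 10),
  nn_relu(),
  nn_linear(10, 2)
)

# Keep the extracted named list alongside the converted torch model
converter <- convert(model, input_dim = c(5), save_model_as_list = TRUE)

# Top-level entries of the extracted list
names(converter$model_as_list)

# Pass the named list back in as a model (assumption: no changes needed)
converter_2 <- convert(converter$model_as_list)

# Both converted models should agree on the same input
x <- torch_randn(3, 5)
out_1 <- converter$model(x)[[1]]
out_2 <- converter_2$model(x)[[1]]
max_diff <- (out_1 - out_2)$abs()$max()$item()
max_diff
```

Because the list can be large, storing it only makes sense when the list itself is needed, for example to inspect or modify the extracted layers.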

Implemented methods

An object of the Converter class can be applied to the following methods:

  • Layerwise Relevance Propagation (LRP), Bach et al. (2015)

  • Deep Learning Important Features (DeepLift), Shrikumar et al. (2017)

  • DeepSHAP, Lundberg et al. (2017)

  • SmoothGrad including SmoothGrad×Input, Smilkov et al. (2017)

  • Vanilla Gradient including Gradient×Input

  • Integrated gradients (IntegratedGradient), Sundararajan et al. (2017)

  • Expected gradients (ExpectedGradient), Erion et al. (2021)

  • ConnectionWeights, Olden et al. (2004)

  • Local interpretable model-agnostic explanation (LIME), Ribeiro et al. (2016)

  • Shapley values (SHAP), Lundberg et al. (2017)
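
Applying a method to a converter can be sketched as follows. This is a minimal sketch under assumptions: the model and data are hypothetical, and it assumes that the method classes Gradient and LRP from this package take the converter as the first argument and the data to be explained as the second, and that results are retrieved with the method get_result().

```r
library(torch)
library(innsight)

# Hypothetical model and random input data (25 samples, 5 features)
model <- nn_sequential(
  nn_linear(5, 10),
  nn_relu(),
  nn_linear(10, 2)
)
data <- torch_randn(25, 5)

# Convert once, then reuse the converter for several methods
converter <- convert(model, input_dim = c(5))

# Vanilla Gradient and LRP on the same converter and data
grad <- Gradient$new(converter, data)
lrp <- LRP$new(converter, data)

# Retrieve the results (by default as an R array)
res_grad <- grad$get_result()
res_lrp <- lrp$get_result()
dim(res_grad)
```

The conversion is the expensive step, so creating the converter once and reusing it keeps the methods comparable on exactly the same converted model.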

Implemented libraries

The converter is implemented for models from the libraries torch (nn_sequential), neuralnet and keras. You can also write a wrapper for other libraries, because a model can be passed as a named list. This is described in detail in the vignette "In-depth explanation" (see vignette("detailed_overview", package = "innsight") or the website).

References


  • J. D. Olden et al. (2004) An accurate comparison of methods for quantifying variable importance in artificial neural networks using simulated data. Ecological Modelling 178, p. 389–397

  • S. Bach et al. (2015) On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS ONE 10, p. 1-46

  • M. T. Ribeiro et al. (2016) "Why should I trust you?": Explaining the predictions of any classifier. KDD 2016, p. 1135-1144

  • A. Shrikumar et al. (2017) Learning important features through propagating activation differences. ICML 2017, p. 4844-4866

  • D. Smilkov et al. (2017) SmoothGrad: removing noise by adding noise. CoRR, abs/1706.03825

  • M. Sundararajan et al. (2017) Axiomatic attribution for deep networks. ICML 2017, p. 3319-3328

  • S. Lundberg et al. (2017) A unified approach to interpreting model predictions. NIPS 2017, p. 4768-4777

  • G. Erion et al. (2021) Improving performance of deep learning models with axiomatic attribution priors and expected gradients. Nature Machine Intelligence 3, p. 620-631

Public fields


model
The converted neural network based on the torch module ConvertedModel.


input_dim
A list of the input dimensions of each input layer. Since the "channels first" format is used internally for all calculations, the input shapes are already in this format. In addition, the batch dimension is not included, e.g., for an input layer of shape c(*,32,32,3) with channels in the last axis you get list(c(3,32,32)).


input_names
A list with the names as factors for each input dimension of the shape as stored in the field input_dim.


output_dim
A list of the output dimensions of each output layer.


output_names
A list with the names as factors for each output dimension of the shape as stored in the field output_dim.


model_as_list
The model stored in a named list (see details for more information). By default, the entry model_as_list$layers is deleted because it may require a lot of memory for large networks. With the argument save_model_as_list, however, it can be kept.
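
The fields above can be inspected directly on a converter object. The following is a minimal sketch; the convolutional model is a hypothetical example, and the sketch assumes that the layer types nn_conv2d and nn_flatten are accepted inside an nn_sequential model.

```r
library(torch)
library(innsight)

# Hypothetical image model: 3 channels, 32 x 32 pixels ("channels first")
model <- nn_sequential(
  nn_conv2d(3, 4, kernel_size = 3),
  nn_relu(),
  nn_flatten(),
  nn_linear(4 * 30 * 30, 2)
)
converter <- convert(model, input_dim = c(3, 32, 32))

# Shapes without the batch dimension, one list entry per input/output layer
converter$input_dim
converter$output_dim

# Automatically generated names, stored per layer and per axis
str(converter$input_names, max.level = 2)
str(converter$output_names, max.level = 2)
```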


Method new()

Create a new Converter object for a given neural network. On initialization, the model is inspected and converted to a named list; then a torch-converted model (ConvertedModel) is created and stored in the field model.


Converter$new(
  model,
  input_dim = NULL,
  input_names = NULL,
  output_names = NULL,
  dtype = "float",
  save_model_as_list = FALSE
)



model
(nn_sequential, keras_model, neuralnet or list)
A trained neural network for classification or regression tasks to be interpreted. Only the following model types are allowed: nn_sequential, keras_model, keras_model_sequential, neuralnet or a named list (see details).


input_dim
(integer or list)
The model input dimension excluding the batch dimension. If there is only one input layer, it can be specified as a vector; otherwise use a list of the shapes of the individual input layers.
Note: This argument is only required for torch::nn_sequential. For all other model types it is extracted automatically from the passed model and, if given, used only for internal checks. In addition, input_dim has to be in the "channels first" format.


input_names
(character, factor or list)
The input names of the model excluding the batch dimension. For a model with a single input layer and a single input axis (e.g., for tabular data), the input names can be specified as a character vector or factor, e.g., for a dense layer with 3 input features use c("X1", "X2", "X3"). If the model input consists of multiple axes (e.g., for signal and image data), use a list of character vectors or factors, one per axis, in the "channels first" format, e.g., use list(c("C1", "C2"), c("L1","L2","L3","L4","L5")) for a 1D convolutional input layer with signal length 5 and 2 channels. For models with multiple input layers, use a list with one such entry per input layer.
Note: This argument is optional; if it is not set, the names are generated automatically. If it is set, any input names found in the passed model are disregarded.


output_names
(character, factor or list)
A character vector with the names for the output dimensions excluding the batch dimension, e.g., for a model with 3 output nodes use c("Y1", "Y2", "Y3"). Instead of a character vector you can also use a factor to set an order for the plots. If the model has multiple output layers, use a list with one such entry per output layer.
Note: This argument is optional; if it is not set, the names are generated automatically. If it is set, any output names found in the passed model are disregarded.


dtype
The data type for the calculations. Use either 'float' for torch::torch_float or 'double' for torch::torch_double.


save_model_as_list
This logical value specifies whether the passed model should be stored as a list. This list can take a lot of memory for large networks, so by default the model is not stored as a list (FALSE).


A new instance of the R6 class Converter.
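
A call that sets the optional arguments explicitly might look like the following minimal sketch. The model and the names are hypothetical; only the argument names come from the usage above.

```r
library(torch)
library(innsight)

# Hypothetical tabular model with 3 input features and 2 outputs
model <- nn_sequential(
  nn_linear(3, 8),
  nn_tanh(),
  nn_linear(8, 2)
)

converter <- Converter$new(model,
  input_dim = c(3),                     # required for torch models
  input_names = c("X1", "X2", "X3"),    # single input layer, single axis
  output_names = c("Y1", "Y2"),
  dtype = "double"                      # calculations in torch_double
)

# The given names replace the automatically generated ones
converter$input_names
converter$output_names
```

Passing a factor instead of a character vector for output_names would additionally fix the order used in plots, as noted in the argument description.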

Method print()

Print a summary of the Converter object. This summary contains the individual fields and in particular the torch-converted model (ConvertedModel) with the layers.

Converter$print()




Returns the Converter object invisibly via base::invisible.

Method clone()

The objects of this class are cloneable with this method.


Converter$clone(deep = FALSE)



deep
Whether to make a deep clone.

Examples


#----------------------- Example 1: Torch ----------------------------------

library(torch)
library(innsight)

model <- nn_sequential(
  nn_linear(5, 10),
  nn_linear(10, 2, bias = FALSE),
  nn_softmax(dim = 2)
)
data <- torch_randn(25, 5)

# Convert the model (for torch models, 'input_dim' is required!)
converter <- Converter$new(model, input_dim = c(5))

# You can also use the helper function `convert()` for initializing a
# Converter object
converter <- convert(model, input_dim = c(5))

# Get the converted model stored in the field 'model'
converted_model <- converter$model

# Test it with the original model
mean(abs(converted_model(data)[[1]] - model(data)))
#> torch_tensor
#> 0
#> [ CPUFloatType{} ][ grad_fn = <MeanBackward0> ]

#----------------------- Example 2: Neuralnet ------------------------------
if (require("neuralnet")) {
  data(iris)

  # Train a neural network
  nn <- neuralnet((Species == "setosa") ~ Petal.Length + Petal.Width,
    iris,
    linear.output = FALSE,
    hidden = c(3, 2), act.fct = "tanh", rep = 1
  )

  # Convert the model
  converter <- convert(nn)

  # Print all the layers
  converter$model$modules_list
}
#> $Dense_1
#> An `nn_module` containing 0 parameters.
#> ── Modules ─────────────────────────────────────────────────────────────────────
#> • activation_f: <nn_tanh> #0 parameters
#> $Dense_2
#> An `nn_module` containing 0 parameters.
#> ── Modules ─────────────────────────────────────────────────────────────────────
#> • activation_f: <nn_tanh> #0 parameters
#> $Dense_3
#> An `nn_module` containing 0 parameters.
#> ── Modules ─────────────────────────────────────────────────────────────────────
#> • activation_f: <nn_tanh> #0 parameters
#----------------------- Example 3: Keras ----------------------------------
if (require("keras") && keras::is_keras_available()) {
  library(keras)

  # Make sure keras is installed properly
  is_keras_available()

  # Define a keras model
  model <- keras_model_sequential() %>%
    layer_conv_2d(
      input_shape = c(32, 32, 3), kernel_size = 8, filters = 8,
      activation = "relu", padding = "same") %>%
    layer_conv_2d(
      kernel_size = 8, filters = 4,
      activation = "tanh", padding = "same") %>%
    layer_conv_2d(
      kernel_size = 4, filters = 2,
      activation = "relu", padding = "same") %>%
    layer_flatten() %>%
    layer_dense(units = 64, activation = "relu") %>%
    layer_dense(units = 1, activation = "sigmoid")

  # Convert this model and save model as list
  converter <- convert(model, save_model_as_list = TRUE)

  # Print the converted model as a named list
  str(converter$model_as_list, max.level = 1)
}
#> List of 7
#>  $ input_dim   :List of 1
#>  $ input_nodes : int 1
#>  $ output_dim  :List of 1
#>  $ output_nodes: int 6
#>  $ layers      :List of 6
#>  $ input_names :List of 1
#>  $ output_names:List of 1
#----------------------- Example 4: List  ----------------------------------

# Define a model

model <- list()
model$input_dim <- 5
model$input_names <- list(c("Feat1", "Feat2", "Feat3", "Feat4", "Feat5"))
model$input_nodes <- c(1)
model$output_dim <- 2
model$output_names <- list(c("Cat", "no-Cat"))
model$output_nodes <- c(2)
model$layers$Layer_1 <-
  list(
    type = "Dense",
    weight = matrix(rnorm(5 * 20), 20, 5),
    bias = rnorm(20),
    activation_name = "tanh",
    dim_in = 5,
    dim_out = 20,
    input_layers = 0, # '0' means model input layer
    output_layers = 2
  )
model$layers$Layer_2 <-
  list(
    type = "Dense",
    weight = matrix(rnorm(20 * 2), 2, 20),
    bias = rnorm(2),
    activation_name = "softmax",
    input_layers = 1,
    output_layers = -1 # '-1' means model output layer
    #dim_in = 20, # These values are optional, but
    #dim_out = 2  # useful for internal checks
  )
# Convert the model
converter <- convert(model)

# Get the model as a torch::nn_module
torch_model <- converter$model

# You can use it as a normal torch model
x <- torch::torch_randn(3, 5)
torch_model(x)
#> [[1]]
#> torch_tensor
#>  0.2051  0.7949
#>  0.0769  0.9231
#>  0.0784  0.9216
#> [ CPUFloatType{3,2} ]