This class analyzes a passed neural network and stores its internal
structure and the individual layers by converting the entire network into an
nn_module. With the help of this converter, many
methods for interpreting the behavior of neural networks are provided, which
give a better understanding of the whole model or individual predictions.
You can use models from the following libraries:
torch (nn_sequential)
keras (keras_model, keras_model_sequential)
neuralnet (neuralnet)
Furthermore, a model can be passed as a list (see
vignette("detailed_overview", package = "innsight") or the website).
The R6 class can also be initialized using the convert function
as a helper function so that no prior knowledge of R6 classes is required.
In order to better understand and analyze the prediction of a neural
network, the preactivation or other information of the individual layers,
which are not stored in an ordinary forward pass, are often required. For
this reason, a given neural network is converted into a torch-based neural
network, which provides all the necessary information for an interpretation.
The converted torch model is stored in the field model and is an instance
of ConvertedModel.
However, before the torch model is created, all relevant details of the
passed model are extracted into a named list. This list can be saved
in complete form in the model_as_list field with the argument
save_model_as_list, but this may consume a lot of memory for large
networks and is not done by default. Also, this named list can again be
used as a passed model for the class Converter, which will be described
in more detail in the section 'Implemented Libraries'.
An object of the Converter class can be passed to the following interpretation methods:
Layerwise Relevance Propagation (LRP), Bach et al. (2015)
Deep Learning Important Features (DeepLift), Shrikumar et al. (2017)
DeepSHAP, Lundberg et al. (2017)
SmoothGrad including SmoothGrad\(\times\)Input, Smilkov et al. (2017)
Vanilla Gradient including Gradient\(\times\)Input
Integrated gradients (IntegratedGradient), Sundararajan et al. (2017)
Expected gradients (ExpectedGradient), Erion et al. (2021)
ConnectionWeights, Olden et al. (2004)
Local interpretable model-agnostic explanation (LIME), Ribeiro et al. (2016)
Shapley values (SHAP), Lundberg et al. (2017)
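As a rough illustration of how a Converter object is handed to one of the methods listed above, the following sketch applies the Vanilla Gradient method to a small torch model. The model architecture, the random input data and the variable names are assumptions made for this illustration only; `Gradient$new()` and `get_result()` are used with their default arguments.

```r
library(torch)
library(innsight)

# A small, untrained torch model (illustrative only)
model <- nn_sequential(
  nn_linear(4, 8),
  nn_relu(),
  nn_linear(8, 2),
  nn_softmax(dim = 2)
)

# Convert the model; 'input_dim' is required for torch models
converter <- convert(model, input_dim = c(4))

# Random input data: 10 instances with 4 features each
data <- matrix(rnorm(10 * 4), nrow = 10, ncol = 4)

# Apply the Vanilla Gradient method to the converted model
grad <- Gradient$new(converter, data)

# Extract the results as an array
res <- get_result(grad)
dim(res)
```

The other methods in the list follow the same pattern: the Converter object is the first argument and the data to be explained is the second.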
The converter is implemented for models from the libraries
torch (nn_sequential), neuralnet and keras. You
can also write a wrapper for other libraries, because a model can be passed
as a named list. This is described in detail in the vignette "In-depth
explanation" (see vignette("detailed_overview", package = "innsight")
or the website).
J. D. Olden et al. (2004) An accurate comparison of methods for quantifying variable importance in artificial neural networks using simulated data. Ecological Modelling 178, p. 389–397
S. Bach et al. (2015) On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS ONE 10, p. 1-46
M. T. Ribeiro et al. (2016) "Why should I trust you?": Explaining the predictions of any classifier. KDD 2016, p. 1135-1144
A. Shrikumar et al. (2017) Learning important features through propagating activation differences. ICML 2017, p. 4844-4866
D. Smilkov et al. (2017) SmoothGrad: removing noise by adding noise. CoRR, abs/1706.03825
M. Sundararajan et al. (2017) Axiomatic attribution for deep networks. ICML 2017, p. 3319-3328
S. Lundberg et al. (2017) A unified approach to interpreting model predictions. NIPS 2017, p. 4768-4777
G. Erion et al. (2021) Improving performance of deep learning models with axiomatic attribution priors and expected gradients. Nature Machine Intelligence 3, p. 620-631
model (ConvertedModel)
The converted neural network based on the torch module ConvertedModel.
input_dim (list)
A list of the input dimensions of each input layer. Since
internally the "channels first" format is used for all calculations, the
input shapes are already in this format. In addition, the batch
dimension isn't included, e.g., for an input layer of shape c(*,32,32,3)
with channels in the last axis you get list(c(3,32,32)).
input_names (list)
A list with the names as factors for each input
dimension of the shape as stored in the field input_dim.
output_dim (list)
A list of the output dimensions of each output layer.
output_names (list)
A list with the names as factors for each
output dimension of the shape as stored in the field output_dim.
model_as_list (list)
The model stored in a named list (see details for more
information). By default, the entry model_as_list$layers is deleted
because it may require a lot of memory for large networks. However, it
can be kept by setting the argument save_model_as_list.
new()
Create a new Converter object for a given neural network. When initialized,
the model is inspected, converted to a list, and then a
torch-converted model (ConvertedModel) is created and stored in
the field model.
Converter$new(
  model,
  input_dim = NULL,
  input_names = NULL,
  output_names = NULL,
  dtype = "float",
  save_model_as_list = FALSE
)
model (nn_sequential, keras_model, neuralnet or list)
A trained neural network for classification or regression
tasks to be interpreted. Only the following model types are allowed:
nn_sequential, keras_model, keras_model_sequential, neuralnet
or a named list (see details).
input_dim (integer or list)
The model input dimension excluding the batch
dimension. If there is only one input layer it can be specified as
a vector, otherwise use a list of the shapes of the
individual input layers.
Note: This argument is only necessary for torch::nn_sequential;
for all other model types it is automatically extracted from the passed model
and used for internal checks. In addition, the input dimension
input_dim has to be in the format "channels first".
input_names (character, factor or list)
The input names of the model excluding the batch dimension. For a model
with a single input layer and input axis (e.g., for tabular data), the
input names can be specified as a character vector or factor, e.g.,
for a dense layer with 3 input features use c("X1", "X2", "X3"). If
the model input consists of multiple axes (e.g., for signal and
image data), use a list of character vectors or factors for each axis
in the format "channels first", e.g., use
list(c("C1", "C2"), c("L1","L2","L3","L4","L5")) for a 1D
convolutional input layer with signal length 5 and 2 channels. For
models with multiple input layers, use a list of such specifications,
one per input layer.
Note: This argument is optional; if it is omitted, the names are
generated automatically. If it is set, any input names found in the
passed model are disregarded.
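The naming scheme for multi-axis inputs described above can be sketched as follows for a 1D convolutional torch model with 2 channels and signal length 5. The architecture and the concrete names are assumptions chosen for illustration; the names are given in "channels first" order, i.e., channel names first and then the names along the signal axis.

```r
library(torch)
library(innsight)

# Illustrative 1D convolutional model: 2 input channels, signal length 5
model <- nn_sequential(
  nn_conv1d(2, 4, kernel_size = 2),  # output shape: (4, 4)
  nn_relu(),
  nn_flatten(),                      # 4 * 4 = 16 features
  nn_linear(16, 3)
)

converter <- convert(
  model,
  input_dim = c(2, 5),  # channels first, without the batch dimension
  input_names = list(
    c("C1", "C2"),                   # names of the channel axis
    c("L1", "L2", "L3", "L4", "L5")  # names of the signal axis
  ),
  output_names = c("Y1", "Y2", "Y3")
)

# Inspect the stored dimensions and names
converter$input_dim
converter$input_names
converter$output_names
```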
output_names (character, factor or list)
A character vector with the names for the output dimensions
excluding the batch dimension, e.g., for a model with 3 output nodes use
c("Y1", "Y2", "Y3"). Instead of a character
vector you can also use a factor to set an order for the plots. If the
model has multiple output layers, use a list of such vectors, one per
output layer.
Note: This argument is optional; if it is omitted, the names are
generated automatically. If it is set, any output names found in the
passed model are disregarded.
dtype (character(1))
The data type for the calculations. Use
either 'float' for torch::torch_float or 'double' for
torch::torch_double.
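A minimal sketch of the effect of this argument, assuming a small illustrative torch model: the same network is converted once with each data type, and the data type of the converted model's output is inspected. The input for the double-precision model is cast explicitly here, which is an assumption made to keep the example independent of any internal casting.

```r
library(torch)
library(innsight)

# Illustrative model
model <- nn_sequential(
  nn_linear(5, 3),
  nn_tanh(),
  nn_linear(3, 1)
)

# Convert once per data type
conv_float <- convert(model, input_dim = c(5), dtype = "float")
conv_double <- convert(model, input_dim = c(5), dtype = "double")

x <- torch_randn(4, 5)

# The converted model returns a list of outputs, one per output layer
out_float <- conv_float$model(x)[[1]]
out_double <- conv_double$model(x$to(dtype = torch_double()))[[1]]

out_float$dtype
out_double$dtype
```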
save_model_as_list (logical(1))
This logical value specifies whether the
passed model should be stored as a list. This list can take
a lot of memory for large networks, so by default the model is not
stored as a list (FALSE).
print()
Print a summary of the Converter object. This summary contains the
individual fields and in particular the torch-converted model
(ConvertedModel) with the layers.
Returns the Converter object invisibly via base::invisible.
#----------------------- Example 1: Torch ----------------------------------
library(torch)
model <- nn_sequential(
  nn_linear(5, 10),
  nn_relu(),
  nn_linear(10, 2, bias = FALSE),
  nn_softmax(dim = 2)
)
data <- torch_randn(25, 5)
# Convert the model ('input_dim' is required for torch models!)
converter <- Converter$new(model, input_dim = c(5))
# You can also use the helper function `convert()` for initializing a
# Converter object
converter <- convert(model, input_dim = c(5))
# Get the converted model stored in the field 'model'
converted_model <- converter$model
# Test it with the original model
mean(abs(converted_model(data)[[1]] - model(data)))
#> torch_tensor
#> 0
#> [ CPUFloatType{} ][ grad_fn = <MeanBackward0> ]
#----------------------- Example 2: Neuralnet ------------------------------
if (require("neuralnet")) {
  library(neuralnet)
  data(iris)
  # Train a neural network
  nn <- neuralnet((Species == "setosa") ~ Petal.Length + Petal.Width,
    iris,
    linear.output = FALSE,
    hidden = c(3, 2), act.fct = "tanh", rep = 1
  )
  # Convert the model
  converter <- convert(nn)
  # Print all the layers
  converter$model$modules_list
}
#> $Dense_1
#> An `nn_module` containing 0 parameters.
#>
#> ── Modules ─────────────────────────────────────────────────────────────────────
#> • activation_f: <nn_tanh> #0 parameters
#>
#> $Dense_2
#> An `nn_module` containing 0 parameters.
#>
#> ── Modules ─────────────────────────────────────────────────────────────────────
#> • activation_f: <nn_tanh> #0 parameters
#>
#> $Dense_3
#> An `nn_module` containing 0 parameters.
#>
#> ── Modules ─────────────────────────────────────────────────────────────────────
#> • activation_f: <nn_tanh> #0 parameters
#>
#----------------------- Example 3: Keras ----------------------------------
# Use '&&' so that the keras check is skipped if the package is missing
if (require("keras") && keras::is_keras_available()) {
  library(keras)
  # Make sure keras is installed properly
  is_keras_available()
  # Define a keras model
  model <- keras_model_sequential() %>%
    layer_conv_2d(
      input_shape = c(32, 32, 3), kernel_size = 8, filters = 8,
      activation = "relu", padding = "same") %>%
    layer_conv_2d(
      kernel_size = 8, filters = 4,
      activation = "tanh", padding = "same") %>%
    layer_conv_2d(
      kernel_size = 4, filters = 2,
      activation = "relu", padding = "same") %>%
    layer_flatten() %>%
    layer_dense(units = 64, activation = "relu") %>%
    layer_dense(units = 1, activation = "sigmoid")
  # Convert this model and save model as list
  converter <- convert(model, save_model_as_list = TRUE)
  # Print the converted model as a named list
  str(converter$model_as_list, max.level = 1)
}
#> List of 7
#> $ input_dim :List of 1
#> $ input_nodes : int 1
#> $ output_dim :List of 1
#> $ output_nodes: int 6
#> $ layers :List of 6
#> $ input_names :List of 1
#> $ output_names:List of 1
#----------------------- Example 4: List ----------------------------------
# Define a model
model <- list()
model$input_dim <- 5
model$input_names <- list(c("Feat1", "Feat2", "Feat3", "Feat4", "Feat5"))
model$input_nodes <- c(1)
model$output_dim <- 2
model$output_names <- list(c("Cat", "no-Cat"))
model$output_nodes <- c(2)
model$layers$Layer_1 <-
  list(
    type = "Dense",
    weight = matrix(rnorm(5 * 20), 20, 5),
    bias = rnorm(20),
    activation_name = "tanh",
    dim_in = 5,
    dim_out = 20,
    input_layers = 0, # '0' means model input layer
    output_layers = 2
  )
model$layers$Layer_2 <-
  list(
    type = "Dense",
    weight = matrix(rnorm(20 * 2), 2, 20),
    bias = rnorm(2),
    activation_name = "softmax",
    input_layers = 1,
    output_layers = -1 # '-1' means model output layer
    #dim_in = 20, # These values are optional, but
    #dim_out = 2  # useful for internal checks
  )
# Convert the model
converter <- convert(model)
# Get the model as a torch::nn_module
torch_model <- converter$model
# You can use it as a normal torch model
x <- torch::torch_randn(3, 5)
torch_model(x)
#> [[1]]
#> torch_tensor
#> 0.2051 0.7949
#> 0.0769 0.9231
#> 0.0784 0.9216
#> [ CPUFloatType{3,2} ]
#>