Utilities
Visualization

captum.attr.visualization.visualize_image_attr(attr, original_image=None, method='heat_map', sign='absolute_value', plt_fig_axis=None, outlier_perc=2, cmap=None, alpha_overlay=0.5, show_colorbar=False, title=None, fig_size=(6, 6), use_pyplot=True)
Visualizes attribution for a given image by normalizing attribution values of the desired sign (positive, negative, absolute value, or all) and displaying them using the desired mode in a matplotlib figure.
 Parameters
attr (numpy.array) – Numpy array corresponding to attributions to be visualized. Shape must be in the form (H, W, C), with channels as last dimension. Shape must also match that of the original image if provided.
original_image (numpy.array, optional) – Numpy array corresponding to original image. Shape must be in the form (H, W, C), with channels as the last dimension. Image can be provided either with float values in range 0-1 or int values between 0-255. This is a necessary argument for any visualization method which utilizes the original image. Default: None
method (string, optional) – Chosen method for visualizing attribution. Supported options are:
1. heat_map – Display heat map of chosen attributions.
2. blended_heat_map – Overlay heat map over greyscale version of original image. Parameter alpha_overlay corresponds to alpha of heat map.
3. original_image – Only display original image.
4. masked_image – Mask image (pixelwise multiply) by normalized attribution values.
5. alpha_scaling – Sets alpha channel of each pixel to be equal to normalized attribution value.
Default: heat_map
sign (string, optional) – Chosen sign of attributions to visualize. Supported options are:
1. positive – Displays only positive pixel attributions.
2. absolute_value – Displays absolute value of attributions.
3. negative – Displays only negative pixel attributions.
4. all – Displays both positive and negative attribution values. This is not supported for masked_image or alpha_scaling modes, since signed information cannot be represented in these modes.
Default: absolute_value
plt_fig_axis (tuple, optional) – Tuple of matplotlib.pyplot.figure and axis on which to visualize. If None is provided, then a new figure and axis are created. Default: None
outlier_perc (float, optional) – Top attribution values which correspond to a total of outlier_perc percentage of the total attribution are set to 1 and scaling is performed using the minimum of these values. For sign=`all`, outliers and scale value are computed using absolute value of attributions. Default: 2
cmap (string, optional) – String corresponding to desired colormap for heatmap visualization. This defaults to “Reds” for negative sign, “Blues” for absolute value, “Greens” for positive sign, and a spectrum from red to green for all. Note that this argument is only used for visualizations displaying heatmaps. Default: None
alpha_overlay (float, optional) – Alpha to set for heatmap when using blended_heat_map visualization mode, which overlays the heat map over the greyscaled original image. Default: 0.5
show_colorbar (boolean, optional) – Displays colorbar for heatmap below the visualization. If given method does not use a heatmap, then a colormap axis is created and hidden. This is necessary for appropriate alignment when visualizing multiple plots, some with colorbars and some without. Default: False
title (string, optional) – Title string for plot. If None, no title is set. Default: None
fig_size (tuple, optional) – Size of figure created. Default: (6,6)
use_pyplot (boolean, optional) – If True, uses pyplot to create and show the figure, displaying it after creation. If False, uses the Matplotlib object-oriented API and simply returns a figure object without showing it. Default: True
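The sign and outlier_perc parameters together determine how raw attributions become values suitable for display. The following is a minimal NumPy sketch of that kind of normalization, not the library's implementation. The helper names `cumulative_sum_threshold` and `normalize_attr` are hypothetical, and summing over the channel axis before normalizing is an assumption made for illustration.

```python
import numpy as np


def cumulative_sum_threshold(values, percentile):
    """Return the smallest value at which the sorted cumulative sum
    reaches `percentile` percent of the total (values must be >= 0)."""
    sorted_vals = np.sort(values.flatten())
    cum_sums = np.cumsum(sorted_vals)
    threshold_id = np.where(cum_sums >= cum_sums[-1] * 0.01 * percentile)[0][0]
    return float(sorted_vals[threshold_id])


def normalize_attr(attr, sign="absolute_value", outlier_perc=2):
    """Hypothetical helper: reduce an (H, W, C) attribution array to (H, W),
    keep the requested sign, and scale so that outliers saturate at 1."""
    # Assumption: channels are combined by summation before normalizing.
    combined = attr.sum(axis=2)
    if sign == "positive":
        combined = np.clip(combined, 0, None)
    elif sign == "negative":
        # Keep only negative values and report their magnitude.
        combined = -np.clip(combined, None, 0)
    elif sign == "absolute_value":
        combined = np.abs(combined)
    elif sign != "all":
        raise ValueError(f"unsupported sign: {sign}")
    # For sign="all" the scale is computed on absolute values.
    threshold = cumulative_sum_threshold(np.abs(combined), 100 - outlier_perc)
    if threshold == 0:
        return np.zeros_like(combined, dtype=float)
    # Values above the threshold are the outliers and are clipped to 1 (or -1).
    return np.clip(combined / threshold, -1, 1)


attr = np.array([[[1.0], [2.0]], [[3.0], [94.0]]])  # shape (2, 2, 1)
print(normalize_attr(attr, sign="absolute_value", outlier_perc=2))
```

In this sketch a larger outlier_perc lowers the scaling threshold, so more pixels saturate at full intensity.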
 Returns
figure (matplotlib.pyplot.figure) – Figure object on which visualization is created. If plt_fig_axis argument is given, this is the same figure provided.
axis (matplotlib.pyplot.axis) – Axis object on which visualization is created. If plt_fig_axis argument is given, this is the same axis provided.
 Return type
2-element tuple of figure, axis
Examples:
>>> # ImageClassifier takes a single input tensor of images Nx3x32x32,
>>> # and returns an Nx10 tensor of class probabilities.
>>> net = ImageClassifier()
>>> ig = IntegratedGradients(net)
>>> # Computes integrated gradients for class 3 for a given image.
>>> attribution, delta = ig.attribute(orig_image, target=3)
>>> # Displays blended heat map visualization of computed attributions.
>>> _ = visualize_image_attr(attribution, orig_image, "blended_heat_map")
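The parameter descriptions ask for NumPy arrays of shape (H, W, C), while the model in the example works with Nx3x32x32 inputs, so a layout conversion is usually needed before visualizing. A minimal sketch of that conversion follows. The helper name `to_hwc` is hypothetical, and the input is a NumPy stand-in for an attribution tensor (with a PyTorch tensor one would typically first obtain an array, for example via `attribution.cpu().detach().numpy()`, which is an assumption about the caller's setup).

```python
import numpy as np


def to_hwc(batch_nchw, index=0):
    """Hypothetical helper: pick one image from an (N, C, H, W) batch
    and move channels last, giving an (H, W, C) array."""
    image_chw = batch_nchw[index]
    return np.transpose(image_chw, (1, 2, 0))


# Stand-in for attributions of a single 3x32x32 image.
attr_batch = np.random.rand(1, 3, 32, 32).astype(np.float32)
attr_hwc = to_hwc(attr_batch)
print(attr_hwc.shape)  # (32, 32, 3)
```

The same conversion applies to the original image, since its shape must match that of the attributions.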

captum.attr.visualization.visualize_image_attr_multiple(attr, original_image, methods, signs, titles=None, fig_size=(8, 6), use_pyplot=True, **kwargs)
Visualizes attribution using multiple visualization methods displayed in a 1 x k grid, where k is the number of desired visualizations.
 Parameters
attr (numpy.array) – Numpy array corresponding to attributions to be visualized. Shape must be in the form (H, W, C), with channels as last dimension. Shape must also match that of the original image if provided.
original_image (numpy.array, optional) – Numpy array corresponding to original image. Shape must be in the form (H, W, C), with channels as the last dimension. Image can be provided either with values in range 0-1 or 0-255. This is a necessary argument for any visualization method which utilizes the original image.
methods (list of strings) – List of strings of length k, defining method for each visualization. Each method must be a valid string argument for method to visualize_image_attr.
signs (list of strings) – List of strings of length k, defining signs for each visualization. Each sign must be a valid string argument for sign to visualize_image_attr.
titles (list of strings, optional) – List of strings of length k, providing a title string for each plot. If None is provided, no titles are added to subplots. Default: None
fig_size (tuple, optional) – Size of figure created. Default: (8, 6)
use_pyplot (boolean, optional) – If True, uses pyplot to create and show the figure, displaying it after creation. If False, uses the Matplotlib object-oriented API and simply returns a figure object without showing it. Default: True
**kwargs (Any, optional) – Any additional arguments which will be passed to every individual visualization. Such arguments include show_colorbar, alpha_overlay, cmap, etc.
 Returns
figure (matplotlib.pyplot.figure) – Figure object on which visualization is created. If plt_fig_axis argument is given, this is the same figure provided.
axis (matplotlib.pyplot.axis) – Axis object on which visualization is created. If plt_fig_axis argument is given, this is the same axis provided.
 Return type
2-element tuple of figure, axis
Examples:
>>> # ImageClassifier takes a single input tensor of images Nx3x32x32,
>>> # and returns an Nx10 tensor of class probabilities.
>>> net = ImageClassifier()
>>> ig = IntegratedGradients(net)
>>> # Computes integrated gradients for class 3 for a given image.
>>> attribution, delta = ig.attribute(orig_image, target=3)
>>> # Displays original image and heat map visualization of
>>> # computed attributions side by side.
>>> _ = visualize_image_attr_multiple(attribution, orig_image, ["original_image", "heat_map"], ["all", "positive"])
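Because methods, signs, and titles must all have length k, and the all sign is not supported for masked_image or alpha_scaling, it can help to check a set of requested visualizations before plotting. The sketch below is a hypothetical pre-check written in plain Python. `plan_visualizations` is not part of the library, and the library may validate its arguments differently.

```python
VALID_METHODS = {
    "heat_map", "blended_heat_map", "original_image",
    "masked_image", "alpha_scaling",
}
VALID_SIGNS = {"positive", "absolute_value", "negative", "all"}
# Modes that cannot represent signed information.
UNSIGNED_ONLY_METHODS = {"masked_image", "alpha_scaling"}


def plan_visualizations(methods, signs, titles=None):
    """Hypothetical helper: validate k (method, sign, title) triples
    and return them in plotting order."""
    if len(methods) != len(signs):
        raise ValueError("methods and signs must have the same length k")
    if titles is not None and len(titles) != len(methods):
        raise ValueError("titles must have length k when provided")
    plan = []
    for i, (method, sign) in enumerate(zip(methods, signs)):
        if method not in VALID_METHODS:
            raise ValueError(f"unknown method: {method}")
        if sign not in VALID_SIGNS:
            raise ValueError(f"unknown sign: {sign}")
        if sign == "all" and method in UNSIGNED_ONLY_METHODS:
            raise ValueError(f"sign 'all' is not supported for {method}")
        title = titles[i] if titles is not None else None
        plan.append((method, sign, title))
    return plan


print(plan_visualizations(["original_image", "heat_map"], ["all", "positive"]))
```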
Interpretable Embeddings

class captum.attr.InterpretableEmbeddingBase(embedding, full_name)
Since some embedding vectors, e.g. word embeddings, are created and assigned in the embedding layers of PyTorch models, we need a way to access those layers, generate the embeddings, and subtract the baseline. To do so, we separate embedding layers from the model, compute the embeddings separately, and do all operations needed outside of the model. The original embedding layer is replaced by an InterpretableEmbeddingBase layer, which passes already precomputed embedding vectors to the layers below.

forward(*inputs, **kwargs)
The forward function of a wrapper embedding layer that takes and returns embedding tensors. It allows embeddings to be created outside of the model and passes them seamlessly to the subsequent layers of the model.
 Parameters
*inputs (Any, optional) – A sequence of input arguments that the forward function takes. Since forward functions can take any type and number of arguments, this ensures that we can execute the forward pass using the interpretable embedding layer. Note that if inputs are specified, it is assumed that the first argument is the embedding tensor generated by the self.embedding layer from all input arguments provided in inputs and kwargs.
**kwargs (Any, optional) – Similar to inputs, we want to make sure that our forward pass supports an arbitrary number and type of key-value arguments. If inputs is not provided, kwargs must be provided, and its first argument corresponds to the embedding tensor generated by self.embedding. Note that we make an assumption here that kwargs is an ordered dict, which is new in Python 3.6 and is not guaranteed to remain that way in newer versions. If the current implementation doesn’t work for special use cases, it is encouraged to override InterpretableEmbeddingBase and address those specifics in descendant classes.
 Returns
Returns a tensor which is the same as first argument passed to the forward function. It passes precomputed embedding tensors to lower layers without any modifications.
 Return type
embedding_tensor (Tensor)
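The pass-through behavior described for forward can be sketched in plain Python. The class below is an illustrative stand-in, not the library's class: it only shows how the first positional argument, or failing that the first keyword argument, is returned unchanged, and it uses strings in place of tensors.

```python
class PassThroughEmbedding:
    """Illustrative stand-in for a wrapper layer that forwards
    precomputed embeddings without modifying them."""

    def forward(self, *inputs, **kwargs):
        if len(inputs) == 0 and len(kwargs) == 0:
            raise ValueError("expected the embedding as the first argument")
        if len(inputs) > 0:
            # The first positional argument is assumed to be the embedding.
            return inputs[0]
        # Relies on kwargs preserving the order in which they were passed.
        return next(iter(kwargs.values()))


layer = PassThroughEmbedding()
print(layer.forward("precomputed_embedding", "attention_mask"))
print(layer.forward(input_emb="precomputed_embedding", mask="attention_mask"))
```

Both calls return the value standing in for the embedding, which is why the order of keyword arguments matters when no positional arguments are given.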

indices_to_embeddings(*input, **kwargs)
Maps indices to corresponding embedding vectors, e.g. word embeddings.
 Parameters
*input (Any, optional) – This can be one or more tensors of input indices or any other variable necessary to compute the embeddings. A typical example of input indices is word or token indices.
**kwargs (Any, optional) – Similar to input, this can be any sequence of key-value arguments necessary to compute the final embedding tensor.
 Returns
A tensor of word embeddings corresponding to the indices specified in the input
 Return type
tensor


captum.attr.configure_interpretable_embedding_layer(model, embedding_layer_name='embedding')
This method wraps the model’s embedding layer with an interpretable embedding layer that allows us to access the embeddings through their indices.
 Parameters
model (torch.nn.Module) – An instance of a PyTorch model that contains embeddings.
embedding_layer_name (str, optional) – The name of the embedding layer in the model that we would like to make interpretable.
 Returns
An instance of the InterpretableEmbeddingBase embedding layer that wraps the model’s embedding layer accessed through embedding_layer_name.
 Return type
interpretable_emb (InterpretableEmbeddingBase)
Examples:
>>> # Let's assume that we have a DocumentClassifier model that
>>> # has a word embedding layer named 'embedding'.
>>> # To make that layer interpretable we need to execute the
>>> # following command:
>>> net = DocumentClassifier()
>>> interpretable_emb = configure_interpretable_embedding_layer(net, 'embedding')
>>> # then we can use interpretable embedding to convert our
>>> # word indices into embeddings.
>>> # Let's assume that we have the following word indices
>>> input_indices = torch.tensor([1, 0, 2])
>>> # we can access word embeddings for those indices with the command
>>> # line stated below.
>>> input_emb = interpretable_emb.indices_to_embeddings(input_indices)
>>> # Let's assume that we want to apply integrated gradients to
>>> # our model and that target attribution class is 3
>>> ig = IntegratedGradients(net)
>>> attribution = ig.attribute(input_emb, target=3)
>>> # after we finish the interpretation we need to remove
>>> # interpretable embedding layer with the following command:
>>> remove_interpretable_embedding_layer(net, interpretable_emb)

captum.attr.remove_interpretable_embedding_layer(model, interpretable_emb)
Removes the interpretable embedding layer and restores the original embedding layer in the model.
 Parameters
model (torch.nn.Module) – An instance of a PyTorch model that contains embeddings.
interpretable_emb (InterpretableEmbeddingBase) – An instance of InterpretableEmbeddingBase that was originally created by the configure_interpretable_embedding_layer function and has to be removed after interpretation is finished.
Examples:
>>> # Let's assume that we have a DocumentClassifier model that
>>> # has a word embedding layer named 'embedding'.
>>> # To make that layer interpretable we need to execute the
>>> # following command:
>>> net = DocumentClassifier()
>>> interpretable_emb = configure_interpretable_embedding_layer(net, 'embedding')
>>> # then we can use interpretable embedding to convert our
>>> # word indices into embeddings.
>>> # Let's assume that we have the following word indices
>>> input_indices = torch.tensor([1, 0, 2])
>>> # we can access word embeddings for those indices with the command
>>> # line stated below.
>>> input_emb = interpretable_emb.indices_to_embeddings(input_indices)
>>> # Let's assume that we want to apply integrated gradients to
>>> # our model and that target attribution class is 3
>>> ig = IntegratedGradients(net)
>>> attribution = ig.attribute(input_emb, target=3)
>>> # after we finish the interpretation we need to remove
>>> # interpretable embedding layer with the following command:
>>> remove_interpretable_embedding_layer(net, interpretable_emb)
Token Reference Base

class captum.attr.TokenReferenceBase(reference_token_idx=0)
A base class for creating a reference (aka baseline) tensor for a sequence of tokens. A typical example of such a token is PAD. Users need to provide the index of the reference token in the vocabulary as an argument to the TokenReferenceBase class.

generate_reference(sequence_length, device)
Generates a reference tensor of the given sequence_length using reference_token_idx.
 Parameters
sequence_length (int) – The length of the reference sequence
device (torch.device) – The device on which the reference tensor will be created.
 Returns
A sequence of reference tokens with shape [sequence_length]
 Return type
tensor
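As a rough illustration of what generate_reference produces, the sketch below builds the same kind of constant sequence with NumPy. `TokenReferenceSketch` is a hypothetical stand-in, not the library class: it returns a NumPy array rather than a tensor and therefore omits the device argument.

```python
import numpy as np


class TokenReferenceSketch:
    """Hypothetical NumPy stand-in for a token reference generator."""

    def __init__(self, reference_token_idx=0):
        # Index of the reference token (for example PAD) in the vocabulary.
        self.reference_token_idx = reference_token_idx

    def generate_reference(self, sequence_length):
        # A 1-D sequence filled with the reference token index.
        return np.full(sequence_length, self.reference_token_idx, dtype=np.int64)


# Assumption for illustration: the PAD token has index 1 in the vocabulary.
token_reference = TokenReferenceSketch(reference_token_idx=1)
reference = token_reference.generate_reference(sequence_length=4)
print(reference.shape)  # (4,)
```

A reference built this way has the same length as the input token sequence, so it can serve as the baseline counterpart of the input indices.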
