# Slide Classifier API

## Class Cluster Scikit

class lecture2notes.models.slide_classifier.class_cluster_scikit.Cluster(algorithm_name='kmeans', num_centroids=20, preference=None, damping=0.5, max_iter=200)[source]

Adds a filename and its corresponding feature vector to the cluster object.

calculate_best_k(max_k=50)[source]

Implements the elbow method: graphs the cost (squared error) as a function of the number of centroids (the value of k). The point at which the graph becomes essentially linear is the optimal value of k. Only works if the algorithm is “kmeans”.
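As a rough illustration of the elbow method described above, the following standalone sketch computes the squared-error cost for each value of k on toy data. It does not use the Cluster class or scikit-learn; `kmeans_cost`, `elbow_costs`, the evenly spaced initialization, and the sample points are all assumptions made for this example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans_cost(points, k, num_iter=20):
    """Run a basic k-means and return the final sum of squared errors."""
    # Deterministic initialization (an assumption of this sketch):
    # pick k evenly spaced points from the sorted data.
    ordered = sorted(points)
    centroids = [ordered[(i * len(ordered)) // k] for i in range(k)]
    for _ in range(num_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            clusters[nearest].append(p)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [
            mean_point(members) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


def elbow_costs(points, max_k):
    """Cost for every k from 1 to max_k; plotting this list shows the elbow."""
    return [kmeans_cost(points, k) for k in range(1, max_k + 1)]


# Toy data: three tight groups, so the elbow should appear at k = 3.
points = [(1.0, 0.0), (1.2, 0.0), (0.8, 0.0),
          (5.0, 0.0), (5.2, 0.0), (4.8, 0.0),
          (9.0, 0.0), (9.2, 0.0), (8.8, 0.0)]
costs = elbow_costs(points, 4)
print([round(c, 2) for c in costs])  # [96.24, 24.24, 0.24, 0.18]
```

The cost drops sharply up to k = 3 and then flattens, which is the "essentially linear" region the description refers to.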

create_affinity_propagation(preference, damping, max_iter, store=True)[source]

Create and fit an affinity propagation cluster

create_algorithm_if_none()[source]

Creates algorithm if it has not been created (if it equals None) based on algorithm_name set in __init__

create_kmeans(num_centroids, store=True)[source]

Create and fit a kmeans cluster

get_closest_sample_filenames_to_centroids()[source]

Return the sample indexes that are closest to each centroid. Ex: If [0,8] is returned then X[0] (X is training data/vectors) is the closest point in X to centroid 0 and X[8] is the closest to centroid 1
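The index convention in the example above can be sketched in plain Python. This is only an illustration of "closest sample per centroid"; the function name `closest_sample_indexes` and the toy data are hypothetical and are not part of the Cluster class.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def closest_sample_indexes(samples, centroids):
    """For each centroid, return the index of the nearest sample."""
    closest = []
    for centroid in centroids:
        index = min(
            range(len(samples)),
            key=lambda i: squared_distance(samples[i], centroid),
        )
        closest.append(index)
    return closest


# X plays the role of the training vectors from the description.
X = [(0.0, 0.1), (4.0, 4.0), (0.9, 1.0), (5.2, 5.1)]
centroids = [(0.0, 0.0), (5.0, 5.0)]
print(closest_sample_indexes(X, centroids))  # [0, 3]
```

Here X[0] is the closest point to centroid 0 and X[3] is the closest to centroid 1.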

get_labels()[source]
get_move_list()[source]

Creates a dictionary of file names and their corresponding centroid numbers.

get_num_clusters()[source]
get_vector_array()[source]

Return a numpy array of the list of vectors stored in self.vectors

get_vectors()[source]
predict(array)[source]

Wrapper function for algorithm.predict. Creates algorithm if it has not been created.

visualize(tensorboard_dir)[source]

Creates tensorboard projection of cluster for simplified viewing and understanding

## Custom nn.Modules

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool

forward(image)[source]
generate()[source]

“Striving for Simplicity: the All Convolutional Net” https://arxiv.org/pdf/1412.6806.pdf Look at Figure 1 on page 8.

“Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization” https://arxiv.org/pdf/1610.02391.pdf Look at Figure 2 on page 4

generate(target_layer)[source]

“Striving for Simplicity: the All Convolutional Net” https://arxiv.org/pdf/1412.6806.pdf Look at Figure 1 on page 8.

Visualize model responses given multiple images

lecture2notes.models.slide_classifier.grad_cam.occlusion_sensitivity(model, images, ids, mean=None, patch=35, stride=1, n_batches=128)[source]

“Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization” https://arxiv.org/pdf/1610.02391.pdf Look at Figure A5 on page 17

Originally proposed in: “Visualizing and Understanding Convolutional Networks” https://arxiv.org/abs/1311.2901

## Learning Rate Finder

class lecture2notes.models.slide_classifier.lr_finder.ExponentialLR(optimizer, end_lr, num_iter, last_epoch=-1)[source]

Exponentially increases the learning rate between two boundaries over a number of iterations.

Parameters
• optimizer (torch.optim.Optimizer) – wrapped optimizer.

• end_lr (float, optional) – the final learning rate which is the upper boundary of the test. Default: 10.

• num_iter (int, optional) – the number of iterations over which the test occurs. Default: 100.

• last_epoch (int) – the index of last epoch. Default: -1.
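One common way to write an exponential increase between two boundaries is sketched below. The function name `exponential_lr` and the exact interpolation formula are assumptions for illustration; the section above does not show how get_lr() computes its values.

```python
def exponential_lr(base_lr, end_lr, iteration, num_iter):
    """Learning rate after `iteration` of `num_iter` steps.

    Interpolates geometrically from base_lr (lower boundary) to
    end_lr (upper boundary); assumed formula, not taken from the source.
    """
    r = iteration / num_iter
    return base_lr * (end_lr / base_lr) ** r


# Sweep from 1e-4 to 1.0 over 100 iterations.
for step in (0, 50, 100):
    print(step, exponential_lr(1e-4, 1.0, step, 100))
```

With these boundaries the midpoint of the sweep is about 1e-2, the geometric mean of the two limits, so each order of magnitude gets the same number of iterations.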

get_lr()[source]
class lecture2notes.models.slide_classifier.lr_finder.LRFinder(model, optimizer, criterion, device=None, memory_cache=True, cache_dir=None)[source]

Learning rate range test.

The learning rate range test increases the learning rate in a pre-training run between two boundaries in a linear or exponential manner. It provides valuable information on how well the network can be trained over a range of learning rates and what is the optimal learning rate.

Parameters
• model (torch.nn.Module) – wrapped model.

• optimizer (torch.optim.Optimizer) – wrapped optimizer where the defined learning is assumed to be the lower boundary of the range test.

• criterion (torch.nn.Module) – wrapped loss function.

• device (str or torch.device, optional) – a string (“cpu” or “cuda”) with an optional ordinal for the device type (e.g. “cuda:X”, where X is the ordinal). Alternatively, can be an object representing the device on which the computation will take place. Default: None, uses the same device as model.

• memory_cache (boolean) – if this flag is set to True, state_dict of model and optimizer will be cached in memory. Otherwise, they will be saved to files under the cache_dir.

• cache_dir (string) – path for storing temporary files. If no path is specified, system-wide temporary directory is used. Notice that this parameter will be ignored if memory_cache is True.

Example

>>> lr_finder = LRFinder(net, optimizer, criterion, device="cuda")


Cyclical Learning Rates for Training Neural Networks: https://arxiv.org/abs/1506.01186

fastai/lr_find: https://github.com/fastai/fastai

plot(skip_start=10, skip_end=5, log_lr=True)[source]

Plots the learning rate range test.

Parameters
• skip_start (int, optional) – number of batches to trim from the start. Default: 10.

• skip_end (int, optional) – number of batches to trim from the end. Default: 5.

• log_lr (bool, optional) – True to plot the learning rate in a logarithmic scale; otherwise, plotted in a linear scale. Default: True.

Performs the learning rate range test.

Parameters

• val_loader (torch.utils.data.DataLoader, optional) – if None the range test will only use the training loss. When given a data loader, the model is evaluated after each iteration on that dataset and the evaluation loss is used. Note that in this mode the test takes significantly longer but generally produces more precise results. Default: None.

• end_lr (float, optional) – the maximum learning rate to test. Default: 10.

• num_iter (int, optional) – the number of iterations over which the test occurs. Default: 100.

• step_mode (str, optional) – one of the available learning rate policies, linear or exponential (“linear”, “exp”). Default: “exp”.

• smooth_f (float, optional) – the loss smoothing factor within the [0, 1) interval. Disabled if set to 0, otherwise the loss is smoothed using exponential smoothing. Default: 0.05.

• diverge_th (int, optional) – the test is stopped when the loss surpasses the threshold: diverge_th * best_loss. Default: 5.
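The interaction between smooth_f and diverge_th can be sketched without a model by feeding in a list of raw loss values. The function name `run_range_test` and the exact smoothing weights are assumptions for illustration, not the library's code.

```python
def run_range_test(raw_losses, smooth_f=0.05, diverge_th=5):
    """Return the recorded (possibly smoothed) losses up to divergence."""
    history = []
    best_loss = None
    for iteration, loss in enumerate(raw_losses):
        if iteration == 0:
            best_loss = loss
        else:
            if smooth_f > 0:
                # Exponential smoothing: blend the new loss with the last
                # recorded value (weighting is an assumption of this sketch).
                loss = smooth_f * loss + (1 - smooth_f) * history[-1]
            best_loss = min(best_loss, loss)
        history.append(loss)
        # Stop once the loss surpasses diverge_th * best_loss.
        if loss > diverge_th * best_loss:
            break
    return history


print(run_range_test([1.0, 1.0, 1.0, 100.0, 1.0], smooth_f=0))  # [1.0, 1.0, 1.0, 100.0]
print(run_range_test([2.0, 1.0], smooth_f=0.5))  # [2.0, 1.5]
```

In the first call smoothing is disabled and the test stops as soon as the loss exceeds five times the best loss seen so far, so the final value is never recorded.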

reset()[source]

Restores the model and optimizer to their initial states.

class lecture2notes.models.slide_classifier.lr_finder.LinearLR(optimizer, end_lr, num_iter, last_epoch=-1)[source]

Linearly increases the learning rate between two boundaries over a number of iterations.

Parameters
• optimizer (torch.optim.Optimizer) – wrapped optimizer.

• end_lr (float, optional) – the final learning rate which is the upper boundary of the test. Default: 10.

• num_iter (int, optional) – the number of iterations over which the test occurs. Default: 100.

• last_epoch (int) – the index of last epoch. Default: -1.
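A linear increase between two boundaries can be written as a straight interpolation. The sketch below is a minimal illustration; the function name `linear_lr` and the formula are assumptions, since this section does not show how get_lr() is implemented.

```python
def linear_lr(base_lr, end_lr, iteration, num_iter):
    """Learning rate after `iteration` of `num_iter` steps.

    Moves in equal increments from base_lr (lower boundary) to
    end_lr (upper boundary); assumed formula, not taken from the source.
    """
    r = iteration / num_iter
    return base_lr + r * (end_lr - base_lr)


# Sweep from 0.0 to 10.0 over 100 iterations.
print(linear_lr(0.0, 10.0, 25, 100))  # 2.5
```

Because the steps are equally sized, most iterations are spent at the high end of the range when the boundaries span several orders of magnitude.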

get_lr()[source]
class lecture2notes.models.slide_classifier.lr_finder.StateCacher(in_memory, cache_dir=None)[source]
retrieve(key)[source]
store(key, state_dict)[source]

## Mish

lecture2notes.models.slide_classifier.mish.f_mish(input, inplace=False)[source]

Applies the mish function element-wise: $$\operatorname{mish}(x) = x \cdot \tanh(\operatorname{softplus}(x)) = x \cdot \tanh(\ln(1 + e^{x}))$$
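To make the formula concrete, here is a scalar sketch using only the standard library. It is not the library's f_mish, which operates on tensors; the helper names `softplus` and `mish_scalar` and the numerically stable form of softplus are choices made for this example.

```python
import math


def softplus(x):
    """ln(1 + exp(x)), rewritten to avoid overflow for large |x|."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish_scalar(x):
    """mish(x) = x * tanh(softplus(x)) for a single float."""
    return x * math.tanh(softplus(x))


for value in (-50.0, 0.0, 1.0, 50.0):
    print(value, mish_scalar(value))
```

The function is close to the identity for large positive inputs, close to zero for large negative inputs, and exactly zero at x = 0.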

class lecture2notes.models.slide_classifier.mish.mish(inplace=False)[source]

Applies the mish function element-wise: $$\operatorname{mish}(x) = x \cdot \tanh(\operatorname{softplus}(x)) = x \cdot \tanh(\ln(1 + e^{x}))$$

Shape:
• Input: (N, *) where * means any number of additional dimensions

• Output: (N, *), same shape as the input

Examples

>>> m = mish()
>>> input = torch.randn(2)
>>> output = m(input)

forward(input)[source]

Forward pass of the function.

training: bool

## Inference

lecture2notes.models.slide_classifier.inference.get_prediction(model, image, percent=False, extract_features=True)[source]
lecture2notes.models.slide_classifier.inference.initialize_model(arch, num_classes)[source]

Load saved model trained using old script (“.pth.tar” file extension is old format).

lecture2notes.models.slide_classifier.inference.transform_image(image, input_size=224)[source]

## Slide Classifier Helpers

lecture2notes.models.slide_classifier.slide_classifier_helpers.convert_relu_to_mish(model)[source]

Find all of the nn.ReLU activation functions in model and replace them with mish.
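A recursive replacement of this kind might look like the sketch below, which assumes PyTorch is installed. The `Mish` class here is a stand-in defined for the example, and `replace_relu_with_mish` is a hypothetical name; neither is the library's own code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Mish(nn.Module):
    """Stand-in activation: x * tanh(softplus(x))."""

    def forward(self, x):
        return x * torch.tanh(F.softplus(x))


def replace_relu_with_mish(module):
    """Walk the module tree and swap every nn.ReLU child for Mish."""
    for name, child in module.named_children():
        if isinstance(child, nn.ReLU):
            setattr(module, name, Mish())
        else:
            # Recurse into containers such as nn.Sequential.
            replace_relu_with_mish(child)
    return module


net = nn.Sequential(
    nn.Linear(4, 8),
    nn.ReLU(),
    nn.Sequential(nn.Linear(8, 2), nn.ReLU()),
)
replace_relu_with_mish(net)
print(net)
```

Replacing children by name with setattr keeps the layer order and any trained weights in the other layers unchanged.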

lecture2notes.models.slide_classifier.slide_classifier_helpers.plot_confusion_matrix(y_pred, y_true, classes, normalize=False, title='Confusion Matrix', cmap=<matplotlib.colors.LinearSegmentedColormap object>, save_path=None)[source]

This function prints and plots the confusion matrix. Normalization can be applied by setting normalize=True. See https://scikit-learn.org/stable/auto_examples/model_selection/plot_confusion_matrix.html#sphx-glr-auto-examples-model-selection-plot-confusion-matrix-py for the scikit-learn example.

## Slide Classifier PyTorch

class lecture2notes.models.slide_classifier.slide_classifier_pytorch.SlideClassifier(hparams)[source]

The main slide classifier model code.

static calculate_stats(output, target)[source]

Used for the training, validation, and testing steps to calculate various statistics.

Parameters
• output (torch.tensor) – the output from the model

• target (torch.tensor) – the ground-truth target classes

Returns

a tuple of tensors in the form (accuracy, precision, recall, f_score)

Return type

[tuple]
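The four statistics can be illustrated on plain lists of class indexes. This sketch is not the method's implementation: it works on Python lists rather than tensors, the name `classification_stats` is hypothetical, and macro averaging across classes is an assumption because the averaging mode is not stated above.

```python
def classification_stats(predicted, target, num_classes):
    """Return (accuracy, precision, recall, f_score) with macro averaging."""
    pairs = list(zip(predicted, target))
    accuracy = sum(p == t for p, t in pairs) / len(pairs)

    precisions, recalls = [], []
    for c in range(num_classes):
        tp = sum(p == c and t == c for p, t in pairs)
        fp = sum(p == c and t != c for p, t in pairs)
        fn = sum(p != c and t == c for p, t in pairs)
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)

    precision = sum(precisions) / num_classes
    recall = sum(recalls) / num_classes
    # F-score as the harmonic mean of the averaged precision and recall.
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return accuracy, precision, recall, f_score


stats = classification_stats([0, 1, 1, 0], [0, 1, 0, 0], num_classes=2)
print(stats)
```

For this toy input one sample of class 0 is misclassified as class 1, giving an accuracy of 0.75 and a macro precision of 0.75.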

configure_optimizers()[source]

Create the optimizers and schedulers.

forward(*args, **kwargs)[source]

Passes *args and **kwargs to self.classification_model since SlideClassifier is a wrapper for the classification model.

get_input_size()[source]

Uses the hparams.arch to return the image input size to the model.

initialize_model(num_classes)[source]

Create the classification model. Modifies the standard models by adding extra layers to improve performance if feature_extract is advanced.

Parameters

num_classes (int) – the number of classes in the data (number of output features)

Returns

the modified pytorch model processed by the configuration options specified

Return type

[pytorch model]

prepare_data()[source]

Creates the PyTorch Datasets using datasets.ImageFolder and applies the appropriate transforms. If hparams.use_random_split is True, the dataset is randomly split 80% for training and 20% for testing, and the dataset folder should contain a folder for each class. If it is False, there should be a folder for each split (named “train” and “val”), where each split folder contains a folder for each class. See the ImageFolder documentation.

This function will also run initialize_model() with len(self.hparams.classes) as the num_classes argument if the classification model has not already been initialized in the __init__ function.
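The 80/20 random split mentioned for hparams.use_random_split can be pictured as a shuffle of sample indexes followed by a cut. The sketch below uses only the standard library; the function name `random_split_indexes`, the seed parameter, and the index-based approach are assumptions for illustration and do not show how prepare_data() actually performs the split.

```python
import random


def random_split_indexes(num_samples, train_fraction=0.8, seed=0):
    """Shuffle sample indexes and cut them into train and test parts."""
    indexes = list(range(num_samples))
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(indexes)
    cut = int(num_samples * train_fraction)
    return indexes[:cut], indexes[cut:]


train_idx, test_idx = random_split_indexes(100)
print(len(train_idx), len(test_idx))  # 80 20
```

Every sample lands in exactly one of the two parts, so the training and testing sets never overlap.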

This helper function sets the .requires_grad attribute of the parameters in the model to False when we are feature extracting. By default, when we load a pretrained model all of the parameters have .requires_grad=True, which is fine if we are training from scratch or finetuning. However, if we are feature extracting and only want to compute gradients for the newly initialized layer then we want all of the other parameters to not require gradients.

Return the validation dataloader. The test process uses the same data as validation but calculates a classification report and displays a confusion matrix.

test_epoch_end(outputs)[source]

Create confusion matrix and calculate a sklearn classification report.

test_step(batch, batch_idx)[source]

Perform a test step. See the PyTorch Lightning documentation for test_step for more info.

training: bool
training_step(batch, batch_idx)[source]

Perform a training step. See the PyTorch Lightning Docs for more info.