Slide Classifier API¶
Class Cluster Scikit¶
- class lecture2notes.models.slide_classifier.class_cluster_scikit.Cluster(algorithm_name='kmeans', num_centroids=20, preference=None, damping=0.5, max_iter=200)[source]¶
- add(vector, filename)[source]¶
Adds a filename and its corresponding feature vector to the cluster object.
- calculate_best_k(max_k=50)[source]¶
Implements the elbow method: graphs the cost (squared error) as a function of the number of centroids (value of k). The point at which the graph becomes essentially linear is the optimal value of k. Only works if algorithm is “kmeans”.
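The elbow method described above can be sketched without any dependencies. This is a minimal illustration, not the library's implementation: `kmeans_cost`, the toy `points`, and the deterministic "first k points" initialization are all assumptions made for the example, and a tiny pure-Python k-means stands in for whatever clustering backend the class actually uses.

```python
# Illustrative only: a tiny k-means (Lloyd's algorithm) used to tabulate the
# elbow-method cost. All names and data here are hypothetical.

def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_cost(points, k, iterations=20):
    """Return the summed squared error after running k-means with k centroids."""
    centroids = [points[i] for i in range(k)]  # deterministic init: first k points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members (keep it if the cluster is empty).
        centroids = [
            tuple(sum(coord) / len(members) for coord in zip(*members)) if members else centroids[c]
            for c, members in enumerate(clusters)
        ]
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


# Three tight groups, interleaved so the first three points come from different groups.
points = [(0, 0), (10, 0), (0, 10), (0.1, 0), (10.1, 0), (0, 10.1), (0, 0.1), (10, 0.1), (0.1, 10)]
costs = {k: kmeans_cost(points, k) for k in range(1, 6)}
for k, cost in costs.items():
    print(f"k={k}: cost={cost:.3f}")
```

On this toy data the cost drops sharply up to k=3 and then flattens, which is the "elbow" one would look for on the plotted curve.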
- create_affinity_propagation(preference, damping, max_iter, store=True)[source]¶
Create and fit an affinity propagation cluster
- create_algorithm_if_none()[source]¶
Creates algorithm if it has not been created (if it equals None) based on algorithm_name set in __init__
- get_closest_sample_filenames_to_centroids()[source]¶
Return the sample indexes that are closest to each centroid. Ex: If [0,8] is returned then X[0] (X is training data/vectors) is the closest point in X to centroid 0 and X[8] is the closest to centroid 1
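As a rough sketch of the lookup described above, the snippet below finds, for each centroid, the index of the nearest training vector and then maps those indexes back to filenames. The function name `closest_sample_indexes`, the toy vectors, and the filenames are hypothetical; the real method may rely on a library helper instead.

```python
def closest_sample_indexes(X, centroids):
    """For each centroid, return the index of the closest vector in X."""
    def squared_distance(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    return [
        min(range(len(X)), key=lambda i: squared_distance(X[i], centroid))
        for centroid in centroids
    ]


# Hypothetical feature vectors and the filenames they were added with.
X = [(0.0, 0.1), (5.0, 5.0), (0.9, 1.0), (4.0, 4.2)]
filenames = ["a.jpg", "b.jpg", "c.jpg", "d.jpg"]
centroids = [(0.0, 0.0), (4.1, 4.1)]

closest = closest_sample_indexes(X, centroids)
closest_filenames = [filenames[i] for i in closest]
print(closest, closest_filenames)
```

Here `closest[j]` is the index of the sample nearest to centroid `j`, matching the `[0, 8]` convention in the description.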
Custom nn.Modules¶
- class lecture2notes.models.slide_classifier.custom_nnmodules.AdaptiveConcatPool2d(sz=None)[source]¶
Layer that concats AdaptiveAvgPool2d and AdaptiveMaxPool2d https://docs.fast.ai/layers.html#AdaptiveConcatPool2d
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
GradCAM¶
- class lecture2notes.models.slide_classifier.grad_cam.Deconvnet(model)[source]¶
“Striving for Simplicity: the All Convolutional Net” https://arxiv.org/pdf/1412.6806.pdf Look at Figure 1 on page 8.
- class lecture2notes.models.slide_classifier.grad_cam.GradCAM(model, candidate_layers=None)[source]¶
“Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization” https://arxiv.org/pdf/1610.02391.pdf Look at Figure 2 on page 4
- class lecture2notes.models.slide_classifier.grad_cam.GuidedBackPropagation(model)[source]¶
“Striving for Simplicity: the All Convolutional Net” https://arxiv.org/pdf/1412.6806.pdf Look at Figure 1 on page 8.
- lecture2notes.models.slide_classifier.grad_cam.main(args)[source]¶
Visualize model responses given multiple images
- lecture2notes.models.slide_classifier.grad_cam.occlusion_sensitivity(model, images, ids, mean=None, patch=35, stride=1, n_batches=128)[source]¶
“Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization” https://arxiv.org/pdf/1610.02391.pdf Look at Figure A5 on page 17
Originally proposed in: “Visualizing and Understanding Convolutional Networks” https://arxiv.org/abs/1311.2901
Learning Rate Finder¶
- class lecture2notes.models.slide_classifier.lr_finder.ExponentialLR(optimizer, end_lr, num_iter, last_epoch=-1)[source]¶
Exponentially increases the learning rate between two boundaries over a number of iterations.
- Parameters
optimizer (torch.optim.Optimizer) – wrapped optimizer.
end_lr (float) – the final learning rate, which is the upper boundary of the test.
num_iter (int) – the number of iterations over which the test occurs.
last_epoch (int) – the index of last epoch. Default: -1.
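To make "exponentially increases the learning rate between two boundaries" concrete, here is a minimal sketch of such a schedule. It assumes the learning rate at step i is `base_lr * (end_lr / base_lr) ** (i / num_iter)`; the function name `exponential_lrs` and the exact interpolation fraction are assumptions, not taken from the class itself.

```python
def exponential_lrs(base_lr, end_lr, num_iter):
    """Learning rates growing geometrically from base_lr to end_lr (assumed form)."""
    ratio = end_lr / base_lr
    return [base_lr * ratio ** (i / num_iter) for i in range(num_iter + 1)]


# Each step multiplies the learning rate by the same factor (10x here).
lrs = exponential_lrs(base_lr=1e-3, end_lr=10.0, num_iter=4)
print(lrs)
```

Because consecutive values differ by a constant factor, the points are evenly spaced on a logarithmic axis.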
- class lecture2notes.models.slide_classifier.lr_finder.LRFinder(model, optimizer, criterion, device=None, memory_cache=True, cache_dir=None)[source]¶
Learning rate range test.
The learning rate range test increases the learning rate in a pre-training run between two boundaries in a linear or exponential manner. It provides valuable information on how well the network can be trained over a range of learning rates and what is the optimal learning rate.
- Parameters
model (torch.nn.Module) – wrapped model.
optimizer (torch.optim.Optimizer) – wrapped optimizer where the defined learning rate is assumed to be the lower boundary of the range test.
criterion (torch.nn.Module) – wrapped loss function.
device (str or torch.device, optional) – a string (“cpu” or “cuda”) with an optional ordinal for the device type (e.g. “cuda:X”, where X is the ordinal). Alternatively, can be an object representing the device on which the computation will take place. Default: None, uses the same device as model.
memory_cache (boolean) – if this flag is set to True, state_dict of model and optimizer will be cached in memory. Otherwise, they will be saved to files under the cache_dir.
cache_dir (string) – path for storing temporary files. If no path is specified, system-wide temporary directory is used. Notice that this parameter will be ignored if memory_cache is True.
Example
>>> lr_finder = LRFinder(net, optimizer, criterion, device="cuda")
>>> lr_finder.range_test(dataloader, end_lr=100, num_iter=100)
Cyclical Learning Rates for Training Neural Networks: https://arxiv.org/abs/1506.01186
fastai/lr_find: https://github.com/fastai/fastai
- plot(skip_start=10, skip_end=5, log_lr=True)[source]¶
Plots the learning rate range test.
- Parameters
skip_start (int, optional) – number of batches to trim from the start. Default: 10.
skip_end (int, optional) – number of batches to trim from the end. Default: 5.
log_lr (bool, optional) – True to plot the learning rate in a logarithmic scale; otherwise, plotted in a linear scale. Default: True.
- range_test(train_loader, val_loader=None, end_lr=10, num_iter=100, step_mode='exp', smooth_f=0.05, diverge_th=5)[source]¶
Performs the learning rate range test.
- Parameters
train_loader (torch.utils.data.DataLoader) – the training set data loader.
val_loader (torch.utils.data.DataLoader, optional) – if None the range test will only use the training loss. When given a data loader, the model is evaluated after each iteration on that dataset and the evaluation loss is used. Note that in this mode the test takes significantly longer but generally produces more precise results. Default: None.
end_lr (float, optional) – the maximum learning rate to test. Default: 10.
num_iter (int, optional) – the number of iterations over which the test occurs. Default: 100.
step_mode (str, optional) – one of the available learning rate policies, linear or exponential (“linear”, “exp”). Default: “exp”.
smooth_f (float, optional) – the loss smoothing factor within the [0, 1) interval. Disabled if set to 0, otherwise the loss is smoothed using exponential smoothing. Default: 0.05.
diverge_th (int, optional) – the test is stopped when the loss surpasses the threshold: diverge_th * best_loss. Default: 5.
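The interaction of smooth_f and diverge_th can be sketched as follows. This is an illustration under stated assumptions: the smoothing is taken to be `smooth_f * loss + (1 - smooth_f) * previous_smoothed_loss`, the diverging value is recorded before stopping, and `track_losses` is a hypothetical helper rather than part of the documented API.

```python
def track_losses(raw_losses, smooth_f=0.05, diverge_th=5):
    """Smooth a stream of losses and stop once the loss exceeds diverge_th * best_loss."""
    history = []
    best_loss = None
    for step, loss in enumerate(raw_losses):
        if step == 0:
            best_loss = loss
        else:
            if smooth_f > 0:
                # Exponential smoothing against the previous recorded value (assumed form).
                loss = smooth_f * loss + (1 - smooth_f) * history[-1]
            best_loss = min(best_loss, loss)
        history.append(loss)
        if loss > diverge_th * best_loss:
            break  # the loss has diverged; end the range test early
    return history


# With smoothing disabled, the test stops at 3.0 because 3.0 > 5 * 0.4.
unsmoothed = track_losses([1.0, 0.5, 0.4, 3.0, 9.0], smooth_f=0, diverge_th=5)
smoothed = track_losses([2.0, 4.0], smooth_f=0.5, diverge_th=5)
print(unsmoothed, smoothed)
```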
- class lecture2notes.models.slide_classifier.lr_finder.LinearLR(optimizer, end_lr, num_iter, last_epoch=-1)[source]¶
Linearly increases the learning rate between two boundaries over a number of iterations.
- Parameters
optimizer (torch.optim.Optimizer) – wrapped optimizer.
end_lr (float) – the final learning rate, which is the upper boundary of the test.
num_iter (int) – the number of iterations over which the test occurs.
last_epoch (int) – the index of last epoch. Default: -1.
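For the linear policy, a comparable sketch interpolates in equal additive steps. The form `base_lr + (i / num_iter) * (end_lr - base_lr)` and the name `linear_lrs` are assumptions made for illustration.

```python
def linear_lrs(base_lr, end_lr, num_iter):
    """Learning rates growing in equal additive steps from base_lr to end_lr (assumed form)."""
    return [base_lr + (i / num_iter) * (end_lr - base_lr) for i in range(num_iter + 1)]


linear_schedule = linear_lrs(base_lr=0.0, end_lr=1.0, num_iter=4)
print(linear_schedule)
```

A linear sweep spends most of its steps near the upper boundary, so it suits narrow ranges; a wide range spanning several orders of magnitude is usually easier to read with the exponential policy.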
Mish¶
- lecture2notes.models.slide_classifier.mish.f_mish(input, inplace=False)[source]¶
Applies the mish function element-wise: \(mish(x) = x * tanh(softplus(x)) = x * tanh(ln(1 + exp(x)))\)
- class lecture2notes.models.slide_classifier.mish.mish(inplace=False)[source]¶
Applies the mish function element-wise: \(mish(x) = x * tanh(softplus(x)) = x * tanh(ln(1 + exp(x)))\)
- Shape:
Input: (N, *) where * means any number of additional dimensions
Output: (N, *), same shape as the input
Examples
>>> m = mish()
>>> input = torch.randn(2)
>>> output = m(input)
- training: bool¶
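The formula above can be checked on plain floats. This scalar sketch is only an illustration of mish(x) = x * tanh(softplus(x)); the helper names `softplus` and `mish_scalar` are hypothetical, and the numerically stable softplus form is a choice made here, not something the module is documented to do.

```python
import math


def softplus(x):
    """ln(1 + exp(x)), written so that large |x| does not overflow."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish_scalar(x):
    """mish(x) = x * tanh(softplus(x)) for a single float."""
    return x * math.tanh(softplus(x))


for value in (-2.0, 0.0, 1.0, 5.0):
    print(f"mish({value}) = {mish_scalar(value):.4f}")
```

For large positive inputs the function approaches the identity, and for large negative inputs it approaches zero.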
Inference¶
- lecture2notes.models.slide_classifier.inference.get_prediction(model, image, percent=False, extract_features=True)[source]¶
- lecture2notes.models.slide_classifier.inference.load_model(model_path='model_best.ckpt')[source]¶
Load saved model from model_path.
Slide Classifier Helpers¶
- lecture2notes.models.slide_classifier.slide_classifier_helpers.convert_relu_to_mish(model)[source]¶
Find all of the nn.ReLU activation functions in model and replace them with mish.
- lecture2notes.models.slide_classifier.slide_classifier_helpers.plot_confusion_matrix(y_pred, y_true, classes, normalize=False, title='Confusion Matrix', cmap=<matplotlib.colors.LinearSegmentedColormap object>, save_path=None)[source]¶
Prints and plots the confusion matrix. Normalization can be applied by setting normalize=True. See https://scikit-learn.org/stable/auto_examples/model_selection/plot_confusion_matrix.html#sphx-glr-auto-examples-model-selection-plot-confusion-matrix-py.
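What normalize=True changes can be sketched with a small pure-Python confusion matrix. The helper `confusion_matrix` below is hypothetical and separate from the plotting function; row-wise normalization (each true class sums to 1) is an assumption based on the linked scikit-learn example.

```python
def confusion_matrix(y_true, y_pred, num_classes, normalize=False):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    if normalize:
        # Divide each row by its total so entries become per-class fractions.
        matrix = [
            [count / sum(row) if sum(row) else 0.0 for count in row]
            for row in matrix
        ]
    return matrix


y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
raw_matrix = confusion_matrix(y_true, y_pred, num_classes=2)
normalized_matrix = confusion_matrix(y_true, y_pred, num_classes=2, normalize=True)
print(raw_matrix, normalized_matrix)
```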
Slide Classifier PyTorch¶
- class lecture2notes.models.slide_classifier.slide_classifier_pytorch.SlideClassifier(hparams)[source]¶
The main slide classifier model code.
- static calculate_stats(output, target)[source]¶
Used for the training, validation, and testing steps to calculate various statistics.
- Parameters
output (torch.tensor) – the output from the model
target (torch.tensor) – the ground-truth target classes
- Returns
a tuple of tensors in the form (accuracy, precision, recall, f_score)
- Return type
[tuple]
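The four statistics in the returned tuple can be illustrated on plain lists of class labels. This sketch assumes macro averaging over classes, which the description above does not specify, and it works on integer labels rather than tensors; `classification_stats` is a hypothetical name.

```python
def classification_stats(y_true, y_pred, num_classes):
    """Return (accuracy, precision, recall, f_score) with macro averaging (assumed)."""
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precisions, recalls, f_scores = [], [], []
    for cls in range(num_classes):
        true_positives = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        predicted = sum(p == cls for p in y_pred)
        actual = sum(t == cls for t in y_true)
        precision = true_positives / predicted if predicted else 0.0
        recall = true_positives / actual if actual else 0.0
        denominator = precision + recall
        f_score = 2 * precision * recall / denominator if denominator else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f_scores.append(f_score)
    return (
        accuracy,
        sum(precisions) / num_classes,
        sum(recalls) / num_classes,
        sum(f_scores) / num_classes,
    )


stats = classification_stats([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], num_classes=3)
print(stats)
```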
- forward(*args, **kwargs)[source]¶
Passes *args and **kwargs to self.classification_model since SlideClassifier is a wrapper for the classification model.
- initialize_model(num_classes)[source]¶
Create the classification model. Modifies the standard models by adding extra layers to improve performance if feature_extract is advanced.
- Parameters
num_classes (int) – the number of classes in the data (number of output features)
- Returns
the modified pytorch model processed by the configuration options specified
- Return type
[pytorch model]
- prepare_data()[source]¶
Creates the PyTorch Datasets using datasets.ImageFolder and applies the appropriate transforms. If hparams.use_random_split is True, the dataset is randomly split 80% for training and 20% for testing, and the dataset folder should contain a folder for each class. If it is False, there should be a folder for each split (named “train” and “val”), where each split folder contains a folder for each class. See the ImageFolder documentation.
This function will also run initialize_model() with len(self.hparams.classes) as the num_classes argument if the classification model has not already been initialized in the __init__ function.
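The 80/20 random split can be pictured as shuffling sample indexes and cutting the list. The source does not say how the split is drawn, so this is only a sketch: `random_split_indexes`, the fixed seed, and the use of Python's `random` module are assumptions for illustration.

```python
import random


def random_split_indexes(num_samples, train_fraction=0.8, seed=0):
    """Shuffle sample indexes and split them into train and validation lists."""
    indexes = list(range(num_samples))
    random.Random(seed).shuffle(indexes)  # seeded so the split is reproducible
    cut = int(num_samples * train_fraction)
    return indexes[:cut], indexes[cut:]


train_indexes, val_indexes = random_split_indexes(100)
print(len(train_indexes), len(val_indexes))
```

Every sample lands in exactly one of the two lists, so the validation set never overlaps the training set.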
- set_parameter_requires_grad(model)[source]¶
This helper function sets the .requires_grad attribute of the parameters in the model to False when we are feature extracting. By default, when we load a pretrained model all of the parameters have .requires_grad=True, which is fine if we are training from scratch or finetuning. However, if we are feature extracting and only want to compute gradients for the newly initialized layer then we want all of the other parameters to not require gradients.
- test_dataloader()[source]¶
Return the validation dataloader. The test process uses the same data as validation but calculates a classification report and displays a confusion matrix.
- test_epoch_end(outputs)[source]¶
Create confusion matrix and calculate a sklearn classification report.
- test_step(batch, batch_idx)[source]¶
Perform a test step. See the PyTorch Lightning documentation for test_step for more info.
- train_dataloader()[source]¶
Create train dataloader if it has not already been created, otherwise return the stored dataloader.
- training: bool¶
- training_step(batch, batch_idx)[source]¶
Perform a training step. See the PyTorch Lightning Docs for more info.
- static validation_epoch_end(outputs, log_prefix='val')[source]¶
Compute average statistics after a validation epoch completes.
- validation_step(batch, batch_idx)[source]¶
Perform a validation step. See the PyTorch Lightning documentation for validation_step for more info.