Convolutional Neural Networks (CNN)
In deep learning, a convolutional neural network (CNN, or ConvNet) is a class of artificial neural network, most commonly applied to analyze visual imagery. They have applications in image and video recognition, recommender systems, image classification, image segmentation, medical image analysis, natural language processing, brain-computer interfaces, and financial time series.
Transfer Learning
Transfer learning is a research problem in machine learning that focuses on storing knowledge gained while solving one problem and applying it to a different but related problem.
For example, it may be easier and faster for someone who can already ride a bicycle to learn to ride a scooter than for someone who cannot. Because the two tasks are similar, the person transfers the ability to keep their balance to the scooter without even realizing it.
In machine learning terms, transfer learning means storing the knowledge obtained while solving one problem and reusing it when faced with another. By building on this prior knowledge, models achieve higher accuracy and learn faster with less training data.
The best part about transfer learning is that we can reuse part of an already trained model instead of training a whole model from scratch, which saves a lot of time.
Dense Convolutional Network (DenseNet)
A Dense Convolutional Network (DenseNet) connects each layer to every other layer in a feed-forward fashion. These connections alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters.
DenseNet works on the idea that convolutional networks can be substantially deeper, more accurate, and more efficient to train if they have shorter connections between layers close to the input and those close to the output. The figure below, taken from the original paper, gives a nice visualization of this connectivity pattern.
Note: DenseNet comes in a lot of variants. I used DenseNet-201 because it’s a small model. If you want you can try out other variants of DenseNet.
Let’s get started!
Model Training Using Transfer Learning
In this tutorial, you will learn how to classify images of aliens and predators using transfer learning from a pre-trained network.
A pre-trained model is one that has previously been trained on a dataset and contains the weights and biases that represent the features of that dataset. You can either use the pre-trained model as is or use transfer learning to customize it for a given task. We will use the second approach.
Let’s start with the table of contents.
Data — download dataset
Initialize — importing libraries, specifying file paths, defining constant variables
Preparing the Images — preparing the dataset for the model with various methods
Model — establishing the model and evaluating the results
Connecting to Drive
We need to connect to Google Drive and go to the directory where our code file is located.
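Assuming the notebook runs on Google Colab, a minimal sketch of this step looks like the following (the project folder name is just a placeholder):

```python
# Mount Google Drive inside the Colab runtime.
from google.colab import drive

drive.mount('/content/drive')

# Move into the project folder; replace the path with wherever your notebook lives.
%cd /content/drive/MyDrive/transfer-learning
```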
Data
In order to download our dataset, we download a kaggle.json API token from our Kaggle account and specify the path to the downloaded file.
We then downloaded the dataset and extracted it from the .zip file.
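A rough sketch of these steps using the Kaggle CLI is shown below; the credentials path, dataset slug, and archive name are assumptions, so adjust them to match your own Kaggle setup:

```python
import os
import zipfile

# Tell the Kaggle CLI where the kaggle.json credentials file lives (assumed path).
os.environ['KAGGLE_CONFIG_DIR'] = '/content/drive/MyDrive/transfer-learning'

# Download the alien-vs-predator dataset (the slug below is an assumption; check it on Kaggle).
!kaggle datasets download -d pmigdal/alien-vs-predator-images

# Extract the downloaded archive into a local data/ folder.
with zipfile.ZipFile('alien-vs-predator-images.zip', 'r') as archive:
    archive.extractall('data')
```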
Initialize
Let us first import the libraries.
Then let’s define the paths of train and validation data.
We have determined the batch size and the number of epochs.
The DenseNet-201 architecture expects images of size (224, 224), so let us resize our images accordingly.
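Putting the initialization together, a sketch of these cells might look like this; the folder layout, batch size, and epoch count are assumptions rather than the exact values from the original notebook:

```python
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

# Paths of the train and validation data (assumed folder layout after extraction).
train_dir = 'data/train'
validation_dir = 'data/validation'

# Constants used throughout the notebook.
BATCH_SIZE = 32            # assumed batch size
EPOCHS = 30                # assumed number of epochs
IMG_SIZE = (224, 224)      # DenseNet-201 expects 224x224 inputs
```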
Preparing The Images
We have loaded the train and validation data. Since there are two classes in our dataset, we set label_mode to binary.
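One way to load the two folders, following the settings just described (a sketch rather than the author's exact cell):

```python
# Load the train and validation folders as batched tf.data datasets.
# label_mode='binary' yields a single 0/1 label because there are only two classes.
train_dataset = tf.keras.utils.image_dataset_from_directory(
    train_dir,
    label_mode='binary',
    image_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
    shuffle=True)

validation_dataset = tf.keras.utils.image_dataset_from_directory(
    validation_dir,
    label_mode='binary',
    image_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
    shuffle=False)

class_names = train_dataset.class_names  # ['alien', 'predator']
```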
We printed the number of images in the two class folders (alien, predator) inside the train directory.
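Counting the files per class can be done directly on the folder structure, for example:

```python
import pathlib

# Count the image files in each class folder of the training set.
for class_name in class_names:
    n_images = len(list(pathlib.Path(train_dir, class_name).glob('*')))
    print(f'{class_name}: {n_images} training images')
```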
Visualize The Images
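A quick way to preview a few training images with their labels, for instance:

```python
# Display a 3x3 grid of training images with their class names.
plt.figure(figsize=(10, 10))
for images, labels in train_dataset.take(1):
    for i in range(9):
        plt.subplot(3, 3, i + 1)
        plt.imshow(images[i].numpy().astype('uint8'))
        plt.title(class_names[int(labels[i].numpy()[0])])
        plt.axis('off')
plt.show()
```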
Image Augmentation
With data augmentation, we artificially expand our training data by generating randomly transformed versions of the existing images.
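A minimal augmentation pipeline built from Keras preprocessing layers; the specific transformations and ranges below are assumptions, not necessarily the ones used in the original notebook:

```python
# Random transformations applied on the fly during training.
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip('horizontal'),
    tf.keras.layers.RandomRotation(0.2),
    tf.keras.layers.RandomZoom(0.1),
])
```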
Create Test Dataset
To create a test dataset, we moved 20% of our validation data to the test_dataset variable.
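Splitting a fifth of the validation batches off as a test set can be done with take and skip, roughly like this:

```python
# Move 20% of the validation batches into a separate test dataset.
val_batches = tf.data.experimental.cardinality(validation_dataset)
test_dataset = validation_dataset.take(val_batches // 5)
validation_dataset = validation_dataset.skip(val_batches // 5)

print('Validation batches:', tf.data.experimental.cardinality(validation_dataset).numpy())
print('Test batches:', tf.data.experimental.cardinality(test_dataset).numpy())
```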
Improvements in Dataset
We cached the datasets to gain speed and shuffled the training images so the model does not learn from a fixed ordering. We also prefetch batches so that reading data overlaps with training, getting the maximum efficiency out of the input pipeline.
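A sketch of these pipeline tweaks (the shuffle buffer size is an assumption):

```python
AUTOTUNE = tf.data.AUTOTUNE

# Cache decoded images, shuffle the training batches, and prefetch so that
# preparing the next batch overlaps with training on the current one.
train_dataset = train_dataset.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
validation_dataset = validation_dataset.cache().prefetch(buffer_size=AUTOTUNE)
test_dataset = test_dataset.cache().prefetch(buffer_size=AUTOTUNE)
```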
Visualize Original And Augmented Images
Let’s visualize it to better understand what augmented images look like.
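One way to do this, reusing the data_augmentation pipeline defined above:

```python
# Show one training image together with several augmented versions of it.
for images, _ in train_dataset.take(1):
    first_image = images[0]
    plt.figure(figsize=(10, 10))
    for i in range(9):
        plt.subplot(3, 3, i + 1)
        augmented = data_augmentation(tf.expand_dims(first_image, 0), training=True)
        plt.imshow(augmented[0].numpy().astype('uint8'))
        plt.axis('off')
plt.show()
```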
Model
Necessary Imports
The libraries we need to import to create the model.
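For example:

```python
from tensorflow.keras.applications import DenseNet201
from tensorflow.keras import layers, Model
from tensorflow.keras.optimizers import Adam
```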
CNN Architecture
Keras provides DenseNet classes that make transfer learning easy. I used the DenseNet201 class with ImageNet weights. We rescaled our dataset according to the preprocessing the DenseNet model expects, used the network for feature extraction, and placed the data augmentation layers we created earlier in front of the base model.
By setting the trainable property of the base model to False, we prevent its pre-trained weights from being updated during training. Otherwise, what the model had already learned would be destroyed.
Since I used this model only for feature extraction, I did not include the fully-connected layer at the top of the network; instead, I specified the input shape and pooling, and added my own pooling and dense layers.
Here is the code to use the pre-trained DenseNet-201 model.
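The sketch below reconstructs such a cell from the description above, so the layer choices and exact arguments may differ from the original notebook:

```python
# DenseNet's own preprocessing rescales pixels the way the pre-trained weights expect.
preprocess_input = tf.keras.applications.densenet.preprocess_input

# Pre-trained DenseNet-201 without its ImageNet classifier head, used as a feature extractor.
base_model = DenseNet201(include_top=False,
                         weights='imagenet',
                         input_shape=IMG_SIZE + (3,))
base_model.trainable = False  # freeze the pre-trained weights

inputs = tf.keras.Input(shape=IMG_SIZE + (3,))
x = data_augmentation(inputs)           # augmentation is only active during training
x = preprocess_input(x)                 # rescale inputs for DenseNet
x = base_model(x, training=False)       # keep batch-norm statistics frozen
x = layers.GlobalAveragePooling2D()(x)  # 1920-dimensional feature vector
outputs = layers.Dense(1, activation='sigmoid')(x)  # 1920 weights + 1 bias = 1,921 parameters
model = Model(inputs, outputs)

model.summary()
```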
The base model has no trainable parameters because its trainable property is False. The full model, with the dense head added, has 1,921 trainable parameters.
Fit The Model
Here are the snippets of training.
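The sketch below summarizes the training cells; the optimizer, learning rate, and ReduceLROnPlateau settings are assumptions chosen to match the behaviour described next, where the learning rate drops partway through training:

```python
# Binary classification: binary cross-entropy loss and accuracy as the metric.
model.compile(optimizer=Adam(learning_rate=1e-3),
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Reduce the learning rate when the validation loss stops improving.
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss',
                                                 factor=0.1,
                                                 patience=3)

history = model.fit(train_dataset,
                    validation_data=validation_dataset,
                    epochs=EPOCHS,
                    callbacks=[reduce_lr])
```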
We can see that the learning rate was reduced at the 24th epoch, and we reach a final accuracy of 93.75% on the validation set, which is pretty good. But wait, we also need to look at the test accuracy.
Evaluate
Let’s visualize the loss and accuracy against the number of epochs.
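A simple way to plot the curves from the History object returned by model.fit:

```python
# Training and validation curves recorded during model.fit.
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']

plt.figure(figsize=(12, 4))

plt.subplot(1, 2, 1)
plt.plot(acc, label='train accuracy')
plt.plot(val_acc, label='validation accuracy')
plt.legend()
plt.title('Accuracy')

plt.subplot(1, 2, 2)
plt.plot(loss, label='train loss')
plt.plot(val_loss, label='validation loss')
plt.legend()
plt.title('Loss')

plt.show()
```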
We got an accuracy of 89.99% on the test dataset.
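This number comes from evaluating the model on the held-out test batches, along these lines:

```python
# Evaluate the trained model on the test dataset.
test_loss, test_accuracy = model.evaluate(test_dataset)
print(f'Test accuracy: {test_accuracy:.4f}')
```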
Results
Transfer learning is a technique that enables existing models to achieve higher performance in a shorter time with less data. Although its advantages are strong, there are situations to watch out for. Transfer learning only works if the initial and target problems are similar enough for the first round of training to be relevant; when the source data and the target data are too different from each other, the problem of negative transfer occurs. If the first round of training is too far off the mark, the model may actually perform worse than if it had never been pre-trained at all. Right now, there are still no clear standards on what types of training are sufficiently related, or how this should be measured.
Deep learning is all about experimentation. You can improve the performance of your model by using a different DenseNet variant or by doing transfer learning with a completely different architecture. You can also change the model's behaviour considerably by tuning its hyperparameters.
I hope the blog helped you in understanding how to perform transfer learning on CNN. Please feel free to experiment more to get better performance. You can find the source code at this link.
Reference:
This Kaggle notebook gave me an idea of how to do transfer learning with DenseNet.
This post on Quora guided me as I explored the disadvantages of transfer learning.
CNN and Transfer Learning definitions from Wikipedia.
Thanks everyone for reading this. I hope it was worth your time. Please feel free to share your valuable feedback or suggestion.
Author
Zeynep Aslan
Artificial Intelligence