Autoencoder: Denoise image using UpSampling2D and Conv2DTranspose Layers (Part: 2)

Ashrafur Rahman
6 min read · Mar 15, 2021
Photo by Bekky Bekks on Unsplash

For better understanding, this post is divided into three parts:

Part 1: GAN, Autoencoders: UpSampling2D and Conv2DTranspose

This introductory part covers the fundamental terms and techniques used throughout the tutorial, so that the following parts are easier to follow.

Part 2: Denoising image with Upsampling Layer

This part will demonstrate how we can use the upsampling method to denoise images. It will be implemented using the notMNIST dataset.

Part 3: Denoising image with Transposed Convolution Layer

This part is similar to the previous one, but I will use transposed convolution for denoising. It will be covered using the famous MNIST dataset.

Let’s start …

Part 2: Denoising image with Upsampling Layer

In this part, we will use the UpSampling2D layer to denoise sample images from the notMNIST dataset. First, we will create noisy images by adding random noise to the dataset, and then train our model on these images.

Dataset and related libraries

First, we need to import the necessary libraries for this project. We will use Keras with TensorFlow as a backend.

# importing libraries
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Conv2DTranspose, MaxPooling2D, UpSampling2D
from tensorflow.keras.constraints import max_norm
import matplotlib.pyplot as plt
import numpy as np
import gzip
%matplotlib inline

We can download the notMNIST dataset directly from this GitHub repository.

# importing dataset from github
# link: https://github.com/davidflanagan/notMNIST-to-MNIST
! wget -O train-images.gz https://github.com/davidflanagan/notMNIST-to-MNIST/raw/master/train-images-idx3-ubyte.gz
! wget -O test-images.gz https://github.com/davidflanagan/notMNIST-to-MNIST/raw/master/t10k-images-idx3-ubyte.gz
! wget -O train-labels.gz https://github.com/davidflanagan/notMNIST-to-MNIST/raw/master/train-labels-idx1-ubyte.gz
! wget -O test-labels.gz https://github.com/davidflanagan/notMNIST-to-MNIST/raw/master/t10k-labels-idx1-ubyte.gz

As you can see, the downloaded files are compressed in .gz format. We could extract the images manually, but instead we will define two functions that extract and load them directly from the compressed files: the first extracts images, the second loads labels. We don't need labels to train the model, but we extract them anyway for visualization purposes.

# function for extracting images
def image_data(filename, num_images):
    with gzip.open(filename) as f:
        f.read(16)  # skip the 16-byte IDX header (magic number, counts, dimensions)
        buf = f.read(28 * 28 * num_images)
        data = np.frombuffer(buf, dtype=np.uint8).astype(np.float32)
        data = data.reshape(num_images, 28, 28)
    return data

# function for extracting labels
def image_labels(filename, num_images):
    with gzip.open(filename) as f:
        f.read(8)  # skip the 8-byte IDX header (magic number, count)
        buf = f.read(1 * num_images)
        labels = np.frombuffer(buf, dtype=np.uint8).astype(np.int64)
    return labels

Now let’s import all train and test images.

# import train and test data
train_data = image_data('train-images.gz', 60000)
test_data = image_data('test-images.gz', 10000)
# import train and test labels
train_labels = image_labels('train-labels.gz', 60000)
test_labels = image_labels('test-labels.gz', 10000)

After loading, we can check the shape of train and test images:

train_data.shape, test_data.shape

Output:

((60000, 28, 28), (10000, 28, 28))

So we have 60000 images in the train set and 10000 images in the test set. Thanks to our functions, they are already converted into NumPy arrays.

img_width, img_height = 28, 28
input_shape = (img_width, img_height, 1)
batch_size = 120
no_epochs = 50
validation_splits = 0.2
max_norm_value = 2.0
noise_factor = 0.5

Here we define some variables that we will use when building and training the model. Keeping them in one place gives us flexibility and makes it easy to fine-tune the model and data later. Now we can proceed to the next step.

Exploring Dataset

The notMNIST dataset labels each image with a digit from 0 to 9, representing the letters A to J respectively. So we will define a dictionary that maps each label to its original letter.

label_dict = {i: a for i, a in zip(range(10), 'ABCDEFGHIJ')}

Now we can visualize some random characters with their labels:

# visualize ten random characters with their labels
digits = [[train_data[idx], train_labels[idx]]
          for idx in np.random.randint(len(train_data), size=10)]
plt.figure(figsize=(len(digits), 1))
for i, data in enumerate(digits):
    plt.subplot(1, len(digits), i+1)
    plt.imshow(data[0])
    plt.title(label_dict[data[1]])
    plt.xticks([])
    plt.yticks([])
plt.show()
Random samples from the dataset

This dataset is a collection of single-channel greyscale images, with pixel values from 0 to 255. It is better to scale each image to the [0, 1] range because values in this range are easier for the network to deal with. This process is known as "normalization" and is a standard part of preprocessing.

# scaling train and test images
train_data = train_data.reshape(-1, 28,28, 1)
test_data = test_data.reshape(-1, 28,28, 1)
train_data = train_data / np.max(train_data)
test_data = test_data / np.max(test_data)
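
A quick sanity check (my addition, not in the original post) confirms the scaling worked:

# verify that pixel values now lie in [0, 1]
print(train_data.min(), train_data.max())  # expected: 0.0 1.0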

Generating Noisy Images

For training, we need noisy images. We draw random noise samples from a normal (Gaussian) distribution using the NumPy function np.random.normal. Here noise_factor determines how much noise we add to each image; as defined earlier, we use noise_factor = 0.5 in this example.

# create noisy images from the dataset
noise_train = train_data + noise_factor * np.random.normal(0, 1, train_data.shape)
noise_test = test_data + noise_factor * np.random.normal(0, 1, test_data.shape)
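
One optional refinement, not in the original code: after adding noise, some pixel values fall outside [0, 1]. If you want the noisy inputs to stay in the same range as the clean targets, you can clip them:

# optionally clip noisy images back into the [0, 1] range
noise_train = np.clip(noise_train, 0.0, 1.0)
noise_test = np.clip(noise_test, 0.0, 1.0)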

Let’s check noisy images against their original images:

# some random images for visualization
fig, ax = plt.subplots(1, 15)
fig.set_size_inches(20, 4)
for i in range(15):
    curr_img = np.reshape(train_data[i], (28, 28))
    curr_lbl = train_labels[i]
    ax[i].imshow(curr_img)
    # ax[i].set_title(f'Label: {label_dict[curr_lbl]}')
    ax[i].set_xticks([])
    ax[i].set_yticks([])
plt.show()

fig, ax = plt.subplots(1, 15)
fig.set_size_inches(20, 4)
for i in range(15):
    curr_img = np.reshape(noise_train[i], (28, 28))
    curr_lbl = train_labels[i]
    ax[i].imshow(curr_img)
    # ax[i].set_title(f'Label: {label_dict[curr_lbl]}')
    ax[i].set_xticks([])
    ax[i].set_yticks([])
plt.show()

Output:

Original image vs Noisy image

They seem noisy enough to train our model on. Now we can define the model itself.

Defining Model

Our model consists of several Conv2D layers and two UpSampling2D layers. The encoder uses MaxPooling2D to downsample the 28×28 input to 14×14 and then 7×7; the decoder uses UpSampling2D with bilinear interpolation to scale it back up, and a final Conv2D layer with a sigmoid activation produces the single-channel output.

# model layers for autoencoder
model = Sequential()
model.add(Conv2D(32, (3, 3),
                 activation='relu',
                 kernel_initializer='he_uniform',
                 padding='same',
                 input_shape=input_shape))
model.add(MaxPooling2D((2, 2), padding='same'))
model.add(Conv2D(64, (3, 3),
                 activation='relu',
                 kernel_initializer='he_uniform',
                 padding='same'))
model.add(MaxPooling2D((2, 2), padding='same'))
model.add(Conv2D(128, (3, 3),
                 activation='relu',
                 kernel_initializer='he_uniform',
                 padding='same'))
model.add(Conv2D(128, (3, 3),
                 activation='relu',
                 kernel_initializer='he_uniform',
                 padding='same'))
model.add(UpSampling2D((2, 2), interpolation='bilinear'))
model.add(Conv2D(64, (3, 3),
                 activation='relu',
                 kernel_initializer='he_uniform',
                 padding='same'))
model.add(UpSampling2D((2, 2), interpolation='bilinear'))
model.add(Conv2D(1, (3, 3),
                 activation='sigmoid',
                 padding='same'))
model.summary()

Output:

Model: "sequential" _________________________________________________________________ Layer (type)                 Output Shape              Param #    ================================================================= conv2d_28 (Conv2D)           (None, 28, 28, 32)        320        _________________________________________________________________ max_pooling2d_6 (MaxPooling2 (None, 14, 14, 32)        0          _________________________________________________________________ conv2d_29 (Conv2D)           (None, 14, 14, 64)        18496      _________________________________________________________________ max_pooling2d_7 (MaxPooling2 (None, 7, 7, 64)          0          _________________________________________________________________ conv2d_30 (Conv2D)           (None, 7, 7, 128)         73856      _________________________________________________________________ conv2d_31 (Conv2D)           (None, 7, 7, 128)         147584     _________________________________________________________________ up_sampling2d_9 (UpSampling2 (None, 14, 14, 128)       0          _________________________________________________________________ conv2d_32 (Conv2D)           (None, 14, 14, 64)        73792      _________________________________________________________________ up_sampling2d_10 (UpSampling (None, 28, 28, 64)        0          _________________________________________________________________ conv2d_33 (Conv2D)           (None, 28, 28, 1)         577        ================================================================= Total params: 314,625 
Trainable params: 314,625
Non-trainable params: 0 _________________________________________________________________

A plot of the model shows its structure:

Model to DOT plot
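
If you want to reproduce such a plot yourself, Keras provides a utility for it (this snippet is my addition; it requires the pydot and graphviz packages to be installed):

# render the model architecture to an image file
tf.keras.utils.plot_model(model, to_file='model.png', show_shapes=True)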

We will use adam as the optimizer and binary_crossentropy as the loss function, and train for 50 epochs.

# compiling and fitting model
model.compile(optimizer='adam',
              loss='binary_crossentropy')
model.fit(noise_train,
          train_data,
          validation_split=validation_splits,
          epochs=no_epochs,
          batch_size=batch_size)
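
As a quick sanity check (my addition, not in the original notebook), we can also measure the reconstruction loss on the noisy test images:

# reconstruction loss on held-out noisy test images
test_loss = model.evaluate(noise_test, test_data, batch_size=batch_size)
print(test_loss)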

Prediction and Visualization

Now we can predict some test samples and visualize them.

# model prediction
fig_samples = noise_test[:10]
fig_original = test_data[:10]
fig_denoise = model.predict(fig_samples)

Let’s compare our predicted denoised images with original test images for better understanding.

# output visualization
for i in range(0, 5):
    noisy_img = noise_test[i]
    original_img = test_data[i]
    denoise_img = fig_denoise[i]

    fig, axes = plt.subplots(1, 3)
    fig.set_size_inches(6, 2.8)
    axes[0].imshow(noisy_img.reshape(28, 28))
    axes[0].set_xticks([])
    axes[0].set_yticks([])
    axes[0].set_title('Noisy')
    axes[1].imshow(original_img.reshape(28, 28))
    axes[1].set_xticks([])
    axes[1].set_yticks([])
    axes[1].set_title('Original')
    axes[2].imshow(denoise_img.reshape(28, 28))
    axes[2].set_xticks([])
    axes[2].set_yticks([])
    axes[2].set_title('Denoised')
    plt.show()

Output

prediction comparison

The output is quite impressive in this case: the denoised images are very close to the originals. You can fine-tune the model and try it on other datasets as well; this is the basic recipe, and it is up to you how you want to play with it.

I hope you now have a good grasp of autoencoders and image denoising. In the next part of the tutorial, we will build another model using the Conv2DTranspose layer on a different dataset.

All code samples for this part can be found here: Colab Link

🅽🅴🆇🆃 ⫸ Part 3: Denoising image using Transposed Convolution Layer

Happy coding!
