Autoencoder: Denoising Images Using UpSampling2D and Conv2DTranspose Layers (Part 3)
For better understanding, this post is divided into three parts:
Part 1: GAN, Autoencoders: UpSampling2D and Conv2DTranspose
In this introductory part, I cover the fundamental terms and procedures used in this tutorial, so that the following parts are easier to follow.
Part 2: Denoising image with Upsampling Layer
This part demonstrates how the upsampling method can be used to denoise images. It is implemented using the notMNIST dataset.
Part 3: Denoising image with Transposed Convolution Layer
This part is similar to the previous one, but I will use transposed convolution for denoising. It is implemented using the famous MNIST dataset.
Let’s start …
Part 3: Denoising image with Transposed Convolution Layer
In this part, we will use the MNIST dataset of handwritten digit images. This is a well-known dataset and needs no introduction, so we will import the necessary libraries and load the dataset into our project.
Dataset and related libraries
As usual, we will use Keras with TensorFlow as a backend.
# importing libraries
import tensorflow as tf
import tensorflow.keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Conv2DTranspose
from tensorflow.keras.constraints import max_norm
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
Keras bundles a collection of ready-to-use datasets under tf.keras.datasets. We will load the MNIST dataset using its load_data() method.
# loading dataset
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# dataset information
print("Number of original training examples:", len(x_train))
print("Number of original test examples:", len(x_test))
print("Shape of a single image:", x_train[0].shape)
Output:
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11493376/11490434 [==============================] - 0s 0us/step
Number of original training examples: 60000
Number of original test examples: 10000
Shape of a single image: (28, 28)
As we can see, the MNIST dataset consists of 60,000 training images and 10,000 test images. Each image is 28x28 pixels with a single gray channel. We can scale the images to the [0.0, 1.0] range for better handling in our model.
# scale the images from [0,255] to the [0.0,1.0] range
x_train, x_test = x_train[..., np.newaxis]/255.0, x_test[..., np.newaxis]/255.0
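Each image now has an explicit channel axis, so a single sample has shape (28, 28, 1). A quick optional sanity check on the shapes and value range:

# optional sanity check: shapes and value range after scaling
print(x_train.shape, x_train.min(), x_train.max())  # (60000, 28, 28, 1) 0.0 1.0
print(x_test.shape, x_test.min(), x_test.max())     # (10000, 28, 28, 1) 0.0 1.0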
Now we will define some variables for our model.
img_width, img_height = 28, 28
input_shape = (img_width, img_height, 1)
batch_size = 120
no_epochs = 50
max_norm_value = 2.0
validation_splits = 0.2
noise_factor = 0.5
Exploring the Dataset
Now we will visualize some random dataset samples.
# some random images for visualization
for _ in range(6):
    digits = [[x_train[idx], y_train[idx]] for idx in
              np.random.randint(len(x_train), size=10)]
    plt.figure(figsize=(len(digits), 1))
    for i, data in enumerate(digits):
        plt.subplot(1, len(digits), i+1)
        plt.imshow(data[0].reshape(28, 28))
        plt.title(data[1])
        plt.xticks([])
        plt.yticks([])
    plt.show()
Output:
Generating Noisy Images
As we did in the previous part, we will generate noisy images using noise_factor = 0.5.
# create noisy images from the dataset
noise_train = x_train + noise_factor * np.random.normal(0, 1, x_train.shape)
noise_test = x_test + noise_factor * np.random.normal(0, 1, x_test.shape)
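Note that adding Gaussian noise can push pixel values outside the [0.0, 1.0] range. The model handles this as-is, but if you prefer noisy inputs that stay in range, you can clip them; this optional step is not part of the original code:

# optional: clip noisy images back into the [0.0, 1.0] range
noise_train = np.clip(noise_train, 0.0, 1.0)
noise_test = np.clip(noise_test, 0.0, 1.0)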
We can check the noisy images against the original images:
# original images
fig, ax = plt.subplots(1, 15)
fig.set_size_inches(20, 4)
for i in range(15):
    curr_img = np.reshape(x_train[i], (28, 28))
    ax[i].imshow(curr_img)
    ax[i].set_xticks([])
    ax[i].set_yticks([])
plt.show()

# noisy images
fig, ax = plt.subplots(1, 15)
fig.set_size_inches(20, 4)
for i in range(15):
    curr_img = np.reshape(noise_train[i], (28, 28))
    ax[i].imshow(curr_img)
    ax[i].set_xticks([])
    ax[i].set_yticks([])
plt.show()
Output:
Defining the Model
Our model consists of two Conv2D layers that act as the encoder, followed by two Conv2DTranspose layers that act as the decoder. As the output, we add one more Conv2D layer with a sigmoid activation. The layer parameters are self-explanatory.
# model layers for autoencoder
model = Sequential()
model.add(Conv2D(64, kernel_size=(3,3),
                 kernel_constraint=max_norm(max_norm_value),
                 activation='relu',
                 kernel_initializer='he_uniform',
                 input_shape=input_shape))
model.add(Conv2D(32, kernel_size=(3,3),
                 kernel_constraint=max_norm(max_norm_value),
                 activation='relu',
                 kernel_initializer='he_uniform'))
model.add(Conv2DTranspose(32, kernel_size=(3,3),
                          kernel_constraint=max_norm(max_norm_value),
                          activation='relu',
                          kernel_initializer='he_uniform'))
model.add(Conv2DTranspose(64, kernel_size=(3,3),
                          kernel_constraint=max_norm(max_norm_value),
                          activation='relu',
                          kernel_initializer='he_uniform'))
model.add(Conv2D(1, kernel_size=(3,3),
                 kernel_constraint=max_norm(max_norm_value),
                 activation='sigmoid',
                 padding='same'))

model.summary()
Output:
Model: "sequential" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= conv2d (Conv2D) (None, 26, 26, 64) 640 _________________________________________________________________ conv2d_1 (Conv2D) (None, 24, 24, 32) 18464 _________________________________________________________________ conv2d_transpose (Conv2DTran (None, 26, 26, 32) 9248 _________________________________________________________________ conv2d_transpose_1 (Conv2DTr (None, 28, 28, 64) 18496 _________________________________________________________________ conv2d_2 (Conv2D) (None, 28, 28, 1) 577 ================================================================= Total params: 47,425
Trainable params: 47,425
Non-trainable params: 0 _________________________________________________________________
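The growing output shapes follow from transposed-convolution arithmetic: with the default 'valid' padding and stride 1, a Conv2DTranspose layer produces out = (in - 1) * stride + kernel_size. A tiny sanity check with a hypothetical helper (not part of the tutorial code):

# output size of a Conv2DTranspose layer with 'valid' padding
def transpose_out_size(in_size, kernel_size=3, stride=1):
    return (in_size - 1) * stride + kernel_size

print(transpose_out_size(24))  # 26 -- first Conv2DTranspose
print(transpose_out_size(26))  # 28 -- second Conv2DTranspose

This is exactly how the 24x24 feature maps from the encoder are expanded back to the 28x28 input resolution.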
We will use adam as the optimizer and binary_crossentropy as the loss function, and train for 50 epochs.
# model compilation & fitting
model.compile(optimizer='adam', loss='binary_crossentropy')
model.fit(noise_train, x_train,
          validation_split=validation_splits,
          epochs=no_epochs,
          batch_size=batch_size)
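model.fit() also returns a History object. If you assign it, e.g. history = model.fit(...), you can plot the training and validation loss curves to monitor convergence; a small optional sketch:

# assumes the fit call above was assigned: history = model.fit(...)
plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('epoch')
plt.ylabel('binary cross-entropy loss')
plt.legend()
plt.show()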
Prediction and Visualization
Now we can predict some test samples and visualize them.
# model prediction
fig_samples = noise_test[:10]
fig_original = x_test[:10]
fig_denoise = model.predict(fig_samples)
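Beyond visual inspection, a simple quantitative check is to compare the mean squared error of the noisy and the denoised images against the clean originals; the denoised MSE should come out much lower. A minimal optional sketch using only NumPy:

# optional: MSE against the clean test images
denoised_all = model.predict(noise_test)
mse_noisy = np.mean((noise_test - x_test) ** 2)
mse_denoised = np.mean((denoised_all - x_test) ** 2)
print("MSE noisy vs. original:   ", mse_noisy)
print("MSE denoised vs. original:", mse_denoised)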
Let’s compare our predicted denoised images with original test images for better understanding.
# compare noisy, original, and denoised images side by side
for i in range(0, 6):
    noisy_img = noise_test[i]
    original_img = x_test[i]
    denoise_img = fig_denoise[i]

    fig, axes = plt.subplots(1, 3)
    fig.set_size_inches(6, 2.8)
    axes[0].imshow(noisy_img.reshape(28, 28))
    axes[0].set_xticks([])
    axes[0].set_yticks([])
    axes[0].set_title('Noisy')
    axes[1].imshow(original_img.reshape(28, 28))
    axes[1].set_xticks([])
    axes[1].set_yticks([])
    axes[1].set_title('Original')
    axes[2].imshow(denoise_img.reshape(28, 28))
    axes[2].set_xticks([])
    axes[2].set_yticks([])
    axes[2].set_title('Denoised')
    plt.show()
Output:
As you can see, it is possible to get impressive results from a simple implementation. That is the idea of an autoencoder and how it can be used to denoise images. In this part, I demonstrated how Conv2DTranspose can be used for this purpose. I hope this is helpful for your future learning.
All code samples for this part can be found here: Colab Link
Happy coding!