We have covered quite a bit of ground in understanding the basics of GANs. In this section, we will apply that understanding and build a GAN from scratch. This generative model will consist of a repeating block architecture, similar to the one presented in the original paper. We will try to replicate the task of generating MNIST digits using our network.
The overall GAN setup can be seen in Figure 6.8. The figure outlines a generator model with noise vector z as input and repeating blocks that transform and scale up the vector to the required dimensions. Each block consists of a dense layer followed by Leaky ReLU activation and a batch-normalization layer. We simply reshape the output from the final block to transform it into the required output image size.
The discriminator, on the other hand, is a simple feedforward network. This model takes an image as input (a real image or the fake output from the generator) and classifies it as real or fake. This simple setup of two competing models helps us to train the overall GAN.
We will be relying on TensorFlow 2 and using the high-level Keras API wherever possible. The first step is to define the discriminator model. In this implementation, we will use a very basic multi-layer perceptron (MLP) as the discriminator model:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Flatten, Dense, LeakyReLU

def build_discriminator(input_shape=(28, 28), verbose=True):
    """
    Utility method to build a MLP discriminator
    Parameters:
        input_shape:
            type:tuple, shape of input image for classification.
            Default shape is (28,28)->MNIST
        verbose:
            type:boolean. Print model summary if set to true.
    Returns:
        tensorflow.keras.Model object
    """
    model = Sequential()
    model.add(Input(shape=input_shape))
    # Flatten the 2D image into a vector for the dense layers
    model.add(Flatten())
    model.add(Dense(512))
    model.add(LeakyReLU(alpha=0.2))
    # Single sigmoid unit: probability that the input image is real
    model.add(Dense(1, activation='sigmoid'))
    if verbose:
        model.summary()
    return model
We use the Sequential API to prepare this simple model, which has just four layers, the last being an output layer with sigmoid activation. Since this is a binary classification task, the final layer has only one unit. We will use binary cross-entropy loss to train the discriminator model.
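To make the loss concrete, the following is a minimal sketch of binary cross-entropy averaged over a small batch, written in plain Python rather than with the Keras loss object. The helper name `binary_cross_entropy`, the clipping constant `eps`, the labelling convention (1 for real, 0 for fake), and the example scores are all assumptions made for illustration; they are not part of the implementation above.

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over a batch (illustrative helper)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        # Clip predictions away from 0 and 1 so that log() stays finite
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)

# Labels: 1 = real image, 0 = fake image (assumed convention)
labels = [1, 1, 0, 0]

# Hypothetical sigmoid outputs from a discriminator that is mostly right
good_preds = [0.9, 0.8, 0.2, 0.1]
# Hypothetical sigmoid outputs from a discriminator that is mostly wrong
bad_preds = [0.1, 0.2, 0.8, 0.9]

good_loss = binary_cross_entropy(labels, good_preds)
bad_loss = binary_cross_entropy(labels, bad_preds)
print(f"good: {good_loss:.4f}, bad: {bad_loss:.4f}")
```

The loss is small when the sigmoid output agrees with the label and grows quickly as the prediction becomes confidently wrong, which is the signal the discriminator is trained on.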
The generator model is also a multi-layer perceptron with multiple layers scaling up the noise vector z to the desired size. Since our task is to generate MNIST-like output samples, the final reshape layer will convert the flat vector into a 28×28 output shape. Note that we will make use of batch normalization to stabilize model training. The following snippet shows a utility method for building the generator model:
import numpy as np
from tensorflow.keras.layers import BatchNormalization, Reshape

def build_generator(z_dim=100, output_shape=(28, 28), verbose=True):
    """
    Utility method to build a MLP generator
    Parameters:
        z_dim:
            type:int(positive). Size of input noise vector to be used as model input.
            Default value is 100
        output_shape:
            type:tuple. Shape of output image.
            Default shape is (28,28)->MNIST
        verbose:
            type:boolean. Print model summary if set to true.
    Returns:
        tensorflow.keras.Model object
    """
    model = Sequential()
    model.add(Input(shape=(z_dim,)))
    # Block 1: dense -> Leaky ReLU -> batch normalization
    model.add(Dense(256))
    model.add(LeakyReLU(alpha=0.2))
    model.add(BatchNormalization(momentum=0.8))
    # Block 2: same pattern with a wider dense layer
    model.add(Dense(512))
    model.add(LeakyReLU(alpha=0.2))
    model.add(BatchNormalization(momentum=0.8))
    # Output: one tanh unit per pixel, reshaped into the image
    model.add(Dense(np.prod(output_shape), activation='tanh'))
    model.add(Reshape(output_shape))
    if verbose:
        model.summary()
    return model
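As a sanity check on the architecture, the sketch below traces how the vector size grows through the generator's dense layers and tallies the parameter count by hand, using plain Python instead of Keras. The helper names `dense_params` and `batchnorm_params` are illustrative, and the batch-normalization count assumes the Keras layer keeps four values per feature (scale, offset, moving mean, and moving variance).

```python
import math

z_dim = 100
output_shape = (28, 28)

# Dense layer widths used by build_generator, ending in the flattened image size
dense_units = [256, 512, math.prod(output_shape)]

def dense_params(n_in, n_out):
    # One weight per input/output pair plus one bias per output unit
    return n_in * n_out + n_out

def batchnorm_params(n_features):
    # Scale, offset, moving mean and moving variance for each feature (assumed)
    return 4 * n_features

total_params = 0
n_in = z_dim
for i, units in enumerate(dense_units):
    total_params += dense_params(n_in, units)
    # Every dense layer except the last is followed by batch normalization
    if i < len(dense_units) - 1:
        total_params += batchnorm_params(units)
    print(f"{n_in:>4} -> {units:>4}")
    n_in = units

print("flattened output size:", n_in)
print("total parameters:", total_params)
```

The Leaky ReLU and reshape layers contribute no parameters, so the total from this tally can be compared against the figure printed by `model.summary()`.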
We simply use these utility methods to create generator and discriminator model objects. The following snippet uses these two model objects to create the GAN object as well:
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

discriminator = build_discriminator()
discriminator.compile(loss='binary_crossentropy',
                      optimizer=Adam(0.0002, 0.5),
                      metrics=['accuracy'])

generator = build_generator()

z_dim = 100  # size of the noise vector; must match the generator's z_dim
z = Input(shape=(z_dim,))
img = generator(z)

# For the combined model we will only train the generator
discriminator.trainable = False

# The discriminator takes generated images as input
# and determines validity
validity = discriminator(img)

# The combined model (stacked generator and discriminator)
# trains the generator to fool the discriminator
gan_model = Model(z, validity)
gan_model.compile(loss='binary_crossentropy', optimizer=Adam(0.0002, 0.5))
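To see why the combined model pits the generator against the discriminator, consider a single generated image and the discriminator's score for it. The sketch below is plain Python rather than Keras. It assumes the common labelling convention of 1 for real and 0 for fake, and the scores in the loop are made-up values used only for illustration.

```python
import math

def discriminator_loss_on_fake(d_score):
    # Discriminator step: the fake image is labelled 0,
    # so binary cross-entropy reduces to -log(1 - D(G(z)))
    return -math.log(1.0 - d_score)

def generator_loss_on_fake(d_score):
    # Generator step through the combined model: the same fake image
    # is labelled 1, so binary cross-entropy reduces to -log(D(G(z)))
    return -math.log(d_score)

# Hypothetical discriminator outputs for one generated image
for d_score in (0.1, 0.5, 0.9):
    d_loss = discriminator_loss_on_fake(d_score)
    g_loss = generator_loss_on_fake(d_score)
    print(f"D(G(z))={d_score:.1f}  d_loss={d_loss:.3f}  g_loss={g_loss:.3f}")
```

As the score for a fake image rises towards 1, the generator's loss falls while the discriminator's loss grows. Freezing the discriminator inside the combined model means only the generator's weights move in response to this signal.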
The final piece of the puzzle is defining the training loop. As described in the previous section, we will train both models (discriminator and generator) in an alternating fashion. Doing so is straightforward with high-level Keras APIs. The following code snippet first loads the MNIST dataset and scales the pixel values between -1 and +1: