Large-scale image classification with Keras - fine-tuning VGG16 -

Last Update: 6/4/2018
Python
Machine Learning

1 Introduction

I recently started learning machine learning and entered an image classification competition on Kaggle. In this post I will summarize how to fine-tune VGG16 with the Keras library in Python [2].

2 About data

The data set is as follows. Until now I had only worked with datasets like MNIST that appear in most ML tutorials, so this felt like a very large dataset to me.

  • Classes: about 15,000
  • Size of training data: about 1.2 million image files, more than 300 GB
  • Size of each image file: varies among files, e.g. 1,600*1,200
  • Test data: about 120,000 files

3 Implementation

3.1 Train

First, here is the entire training script.

train.py
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.applications.vgg16 import VGG16
from keras.layers import Dense, Dropout, Flatten, Input, BatchNormalization
from keras.models import Model, Sequential
from keras.callbacks import ModelCheckpoint
import numpy as np

train_data_dir = "/train/"
validation_data_dir = "/validation/"

train_datagen = ImageDataGenerator(rescale=1. / 255)
validation_datagen = ImageDataGenerator(rescale=1. / 255)

img_width, img_height = 200, 150
nb_train_samples = 915649
nb_validation_samples = 302091
epochs = 50
batch_size = 64
nb_category = 14951

train_generator = train_datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode="categorical")

validation_generator = validation_datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode="categorical")

# define input_tensor
input_tensor = Input(shape=(img_width, img_height, 3))

vgg16 = VGG16(include_top=False, weights='imagenet', input_tensor=input_tensor)

top_model = Sequential()
top_model.add(Flatten(input_shape=vgg16.output_shape[1:]))
top_model.add(Dense(256, activation='relu', kernel_initializer='he_normal'))
top_model.add(BatchNormalization())
top_model.add(Dropout(0.5))
top_model.add(Dense(nb_category, activation='softmax'))

# Connect vgg16 and top_model
model = Model(inputs=vgg16.input, outputs=top_model(vgg16.output))

# Fix layers
for layer in model.layers[:15]:
    layer.trainable = False

optimizer = optimizers.RMSprop(lr=5e-7, decay=5e-5)
model.compile(loss='categorical_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])

checkpoint_cb = ModelCheckpoint("snapshot/{epoch:03d}-{val_acc:.5f}.hdf5", save_best_only=True)

model.fit_generator(
        train_generator,
        steps_per_epoch=nb_train_samples // batch_size,
        epochs=epochs,
        validation_data=validation_generator,
        validation_steps=nb_validation_samples // batch_size,
        callbacks=[checkpoint_cb])

# Save the model
model.save("model.h5")

model.summary()

3.1.1 Learning with large-scale data

In most tutorials, data is loaded like this:

how_to_load_data
from keras.datasets import mnist

(X_train, y_train), (X_test, y_test) = mnist.load_data()

However, with large-scale data like this, it is not possible to load the whole dataset into memory. Instead, you can use the flow_from_directory function to load the data [1]. It reads images from disk and generates preprocessed (and optionally augmented) batches in real time.

First you need to create an ImageDataGenerator.

Create_ImageDataGenerator
train_datagen = ImageDataGenerator(rescale=1. / 255)
validation_datagen = ImageDataGenerator(rescale=1. / 255)

This class generates batches of image data while applying preprocessing. Here we only rescale: multiplying by 1/255 normalizes the RGB values from the 0-255 range to 0-1.
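
As a side note, the same class can also perform data augmentation on the fly. The generator below is only a sketch of what that could look like; the training in this article uses nothing but the rescaling.

ImageDataGenerator_with_augmentation
from keras.preprocessing.image import ImageDataGenerator

# A sketch with a few common augmentation options on top of the rescaling;
# not used in this article, which only rescales.
augmenting_datagen = ImageDataGenerator(
        rescale=1. / 255,
        rotation_range=15,       # rotate images randomly by up to 15 degrees
        width_shift_range=0.1,   # shift images horizontally by up to 10% of the width
        height_shift_range=0.1,  # shift images vertically by up to 10% of the height
        horizontal_flip=True)    # randomly flip images left-right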

Next, read the data.

Read_data
train_generator = train_datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode="categorical")

validation_generator = validation_datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode="categorical")

Since this is a multi-class classification, set class_mode to "categorical". Be careful about the folder structure here: as shown below, you must create one sub-folder for each class you want to classify (a quick way to check the resulting folder-to-label mapping is shown after the listing).

Folder_structure
data/
     train/
           classA/
                 aaa.jpg
                 bbb.jpg
                 ...
           classB/
                 ccc.jpg
                 ddd.jpg
                 ...

     validation/
           classA/
                 eee.jpg
                 fff.jpg
                 ...
           classB/
                 ggg.jpg
                 hhh.jpg
                 ...
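
As mentioned above, you can verify that Keras picked up the folder layout correctly by looking at the class_indices attribute of the generator (the class names below are just the placeholders from the listing):

Check_class_indices
# class_indices maps each class sub-folder name to the integer label
# that flow_from_directory assigned to it.
print(train_generator.class_indices)        # e.g. {'classA': 0, 'classB': 1, ...}
print(len(train_generator.class_indices))   # number of classes found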

3.1.2 Using the VGG16 model

Train using the VGG16 model that ships with Keras.

fine-tune_vgg16_model
# define input_tensor
input_tensor = Input(shape=(img_width, img_height, 3))

vgg16 = VGG16(include_top=False, weights='imagenet', input_tensor=input_tensor)

top_model = Sequential()
top_model.add(Flatten(input_shape=vgg16.output_shape[1:]))
top_model.add(Dense(256, activation='relu', kernel_initializer='he_normal'))
top_model.add(BatchNormalization())
top_model.add(Dropout(0.5))
top_model.add(Dense(nb_category, activation='softmax'))

# Connect vgg16 with top_model
model = Model(inputs=vgg16.input, outputs=top_model(vgg16.output))

# Fix layers
for layer in model.layers[:15]:
    layer.trainable = False

optimizer = optimizers.RMSprop(lr=5e-7, decay=5e-5)
model.compile(loss='categorical_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])

checkpoint_cb = ModelCheckpoint("snapshot/{epoch:03d}-{val_acc:.5f}.hdf5", save_best_only=True)

model.fit_generator(
        train_generator,
        steps_per_epoch=nb_train_samples // batch_size,
        epochs=epochs,
        validation_data=validation_generator,
        validation_steps=nb_validation_samples // batch_size,
        callbacks=[checkpoint_cb])

First, instantiate the stock VGG16 model. At this point, the size of the input images is specified via an Input tensor.

Create_model
input_tensor = Input(shape=(img_width, img_height, 3))

vgg16 = VGG16(include_top=False, weights='imagenet', input_tensor=input_tensor)
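
With include_top=False the model stops at the last pooling layer, so its output is still a 3D feature map rather than a class vector. A quick way to check it:

Check_output_shape
# For the (200, 150, 3) input above, the feature map after the last pooling
# layer should come out to roughly (None, 6, 4, 512).
print(vgg16.output_shape)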

Next, attach your own top model. Note that include_top in the argument above must be set to False for fine-tuning, so that VGG16's original fully-connected classifier is removed and can be replaced.

Connect_model
top_model = Sequential()
top_model.add(Flatten(input_shape=vgg16.output_shape[1:]))
top_model.add(Dense(256, activation='relu', kernel_initializer='he_normal'))
top_model.add(BatchNormalization())
top_model.add(Dropout(0.5))
top_model.add(Dense(nb_category, activation='softmax'))

# Connect vgg16 with top_model
model = Model(inputs=vgg16.input, outputs=top_model(vgg16.output))

To be honest, I don't fully understand the ReLU activation specified here, so the design of this top model may not be a very helpful reference.
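
For what it's worth, ReLU itself is just max(0, x): negative inputs become 0 and positive inputs pass through unchanged.

relu_example
import numpy as np

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(np.maximum(0, x))   # ReLU output: approximately [0. 0. 0. 1.5 3.]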

Next, the weights are frozen. If this is not specified, the pre-trained weights would also be updated during training; this time, the first 15 layers are kept fixed.

Fix_layers
for layer in model.layers[:15]:
    layer.trainable = False
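
You can confirm which layers were frozen by printing each layer's trainable flag:

Check_frozen_layers
# The first 15 layers (the lower VGG16 blocks) should show trainable = False;
# everything above them remains trainable.
for i, layer in enumerate(model.layers):
    print(i, layer.name, layer.trainable)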

Compile the model, specifying the optimizer, which determines the algorithm used to update the parameters.

Compile_the_model
optimizer = optimizers.RMSprop(lr=5e-7, decay=5e-5)
model.compile(loss='categorical_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])
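
As an aside, fine-tuning is also commonly done with plain SGD using a small learning rate and momentum instead of RMSprop. The snippet below is only an alternative sketch, not what this article uses.

Alternative_optimizer
from keras import optimizers

# Alternative (not used above): SGD with a small learning rate and momentum,
# a common choice when fine-tuning pre-trained convolutional layers.
optimizer = optimizers.SGD(lr=1e-4, momentum=0.9)
model.compile(loss='categorical_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])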

Finally, training. If a ModelCheckpoint is set in callbacks, intermediate results are saved during training; it is not mandatory.

Learning
checkpoint_cb = ModelCheckpoint("snapshot/{epoch:03d}-{val_acc:.5f}.hdf5", save_best_only=True)

model.fit_generator(
        train_generator,
        steps_per_epoch=nb_train_samples // batch_size,
        epochs=epochs,
        validation_data=validation_generator,
        validation_steps=nb_validation_samples // batch_size,
        callbacks=[checkpoint_cb])
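
Since ModelCheckpoint saves complete models by default (architecture plus weights), you can later reload one of the snapshots with load_model. The file name below is a placeholder for whatever snapshot was actually written.

Load_checkpoint
from keras.models import load_model

# "snapshot/0XX-0.XXXXX.hdf5" is a placeholder; substitute a file actually
# written by ModelCheckpoint during training.
model = load_model("snapshot/0XX-0.XXXXX.hdf5")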

3.2 Predict

Predict
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.applications.vgg16 import VGG16
from keras.layers import Dense, Dropout, Flatten, Input, BatchNormalization
from keras.models import Model, Sequential, load_model
import pandas as pd
import numpy as np
import os
from pandas import DataFrame

test_data_dir = "/test/"

test_datagen = ImageDataGenerator(rescale=1. / 255)

img_width, img_height = 200, 150
nb_test_samples = 115474
batch_size = 1
nb_category = 14951

test_generator = test_datagen.flow_from_directory(
        test_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)

model = load_model("model.h5")

pred = model.predict_generator(
        test_generator,
        steps=nb_test_samples,
        verbose=1)

It is not very different from the training code. Data is again read with flow_from_directory, but note that a sub-folder is required even for the test data (whose classes are unknown). For example, with the following layout, it will not work unless the images are placed in a sub-folder.

Folder_structure
data/
     test/
           sub/
                 aaa.jpg
                 bbb.jpg
                 ...

Since the classes are unknown, set class_mode to None.
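
The rows of pred follow the order of test_generator.filenames, so the predicted class index for each file is simply the argmax of its row. Below is a rough sketch of turning this into a table; the class_indices dictionary is a placeholder, since the real folder-to-label mapping comes from the training generator and is not stored in model.h5.

Map_predictions_to_labels
# Each row of pred is a probability distribution over the nb_category classes,
# in the same order as test_generator.filenames.
predicted_indices = np.argmax(pred, axis=1)

# Placeholder mapping: the real one is train_generator.class_indices from the
# training run (e.g. saved to a file), because model.h5 does not contain it.
class_indices = {"classA": 0, "classB": 1}
index_to_class = {v: k for k, v in class_indices.items()}

results = DataFrame({
    "file": test_generator.filenames,
    "label": [index_to_class.get(i, i) for i in predicted_indices],
})
results.to_csv("predictions.csv", index=False)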

4 Summary

  • Understood how to use VGG16 with Keras.
  • Load large-scale data with flow_from_directory.
  • Note that you need sub-folders for prediction as well.

5 Reference
