Large-scale image classification with Keras: fine-tuning VGG16


1 Introduction

I recently started learning machine learning and entered a Kaggle image-classification competition. In this post I will summarize how to fine-tune VGG16 with the Python Keras library [^2].

2 About data

The data set is as follows. Until now I had only worked with data sets like MNIST, which are common in ML tutorials, so this felt very large to me.

  • Classes: about 15,000
  • Size of training data: about 1.2 million image files, more than 300 GB in total
  • Size of each image file: varies from file to file, e.g. 1,600 x 1,200
  • Test data: about 120,000 files

3 Implementation

3.1 Train

First, here is the entire code.

python
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.applications.vgg16 import VGG16
from keras.layers import Dense, Dropout, Flatten, Input, BatchNormalization
from keras.models import Model, Sequential
from keras.callbacks import ModelCheckpoint
import numpy as np

train_data_dir = "/train/"
validation_data_dir = "/validation/"

train_datagen = ImageDataGenerator(rescale=1. / 255)
validation_datagen = ImageDataGenerator(rescale=1. / 255)

img_width, img_height = 200, 150
nb_train_samples = 915649
nb_validation_samples = 302091
epochs = 50
batch_size = 64
nb_category = 14951

# note: target_size is interpreted by Keras as (height, width)
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode="categorical")

validation_generator = validation_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode="categorical")

# define input_tensor
input_tensor = Input(shape=(img_width, img_height, 3))

vgg16 = VGG16(include_top=False, weights='imagenet', input_tensor=input_tensor)

top_model = Sequential()
top_model.add(Flatten(input_shape=vgg16.output_shape[1:]))
top_model.add(Dense(256, activation='relu', kernel_initializer='he_normal'))
top_model.add(BatchNormalization())
top_model.add(Dropout(0.5))
top_model.add(Dense(nb_category, activation='softmax'))

# Connect vgg16 and top_model
model = Model(inputs=vgg16.input, outputs=top_model(vgg16.output))

# Freeze the first 15 layers
for layer in model.layers[:15]:
    layer.trainable = False

optimizer = optimizers.RMSprop(lr=5e-7, decay=5e-5)
model.compile(loss='categorical_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])

checkpoint_cb = ModelCheckpoint("snapshot/{epoch:03d}-{val_acc:.5f}.hdf5", save_best_only=True)

model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size,
    callbacks=[checkpoint_cb])

# Save the model
model.save("model.h5")

model.summary()

3.1.1 Learning with large-scale data

In most tutorials, data is loaded like this:

python
(X_train, y_train), (X_test, y_test) = mnist.load_data()

However, with data this large it is impossible to load everything into memory at once. Instead, you can use the flow_from_directory function [^1], which reads images from disk in batches and can also apply preprocessing and augmentation in real time.

First, create an ImageDataGenerator.

python
train_datagen = ImageDataGenerator(rescale=1. / 255)
validation_datagen = ImageDataGenerator(rescale=1. / 255)

This class generates batches of image data while applying preprocessing. Here it only rescales: multiplying by 1/255 normalizes RGB values from the 0-255 range to 0-1.
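The same class can also perform the real-time augmentation mentioned above. I did not use it here, but a minimal sketch looks like this (the specific values are just illustrative, borrowed from [^1]):

python
# optional: real-time data augmentation for the training set
# (values are illustrative, following the reference article [^1])
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)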

Next, read the data:

python
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode="categorical")

validation_generator = validation_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode="categorical")

Since this is a multi-class classification problem, set class_mode to "categorical". Be careful about the folder structure here: as shown below, you must create a sub-folder for each class you want to classify (a small helper sketch follows the listing).

text
data/
  train/
    classA/
      aaa.jpg
      bbb.jpg
      ...
    classB/
      ccc.jpg
      ddd.jpg
      ...

  validation/
    classA/
      eee.jpg
      fff.jpg
      ...
    classB/
      ggg.jpg
      hhh.jpg
      ...
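If your raw data arrives as a flat folder plus a label file, as is common on Kaggle, a small script can build this layout. A minimal sketch, assuming a hypothetical labels.csv with filename and label columns:

python
# sketch: sort a flat folder of images into per-class sub-folders
# ("labels.csv" and the "raw/" folder are hypothetical names)
import os
import shutil
import pandas as pd

labels = pd.read_csv("labels.csv")
for _, row in labels.iterrows():
    class_dir = os.path.join("data/train", str(row["label"]))
    os.makedirs(class_dir, exist_ok=True)
    shutil.move(os.path.join("raw", row["filename"]),
                os.path.join(class_dir, row["filename"]))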

3.1.2 Using the VGG16 model

Train using the VGG16 model that ships with Keras.

python
# define input_tensor
input_tensor = Input(shape=(img_width, img_height, 3))

vgg16 = VGG16(include_top=False, weights='imagenet', input_tensor=input_tensor)

top_model = Sequential()
top_model.add(Flatten(input_shape=vgg16.output_shape[1:]))
top_model.add(Dense(256, activation='relu', kernel_initializer='he_normal'))
top_model.add(BatchNormalization())
top_model.add(Dropout(0.5))
top_model.add(Dense(nb_category, activation='softmax'))

# Connect vgg16 with top_model
model = Model(inputs=vgg16.input, outputs=top_model(vgg16.output))

# Freeze the first 15 layers
for layer in model.layers[:15]:
    layer.trainable = False

optimizer = optimizers.RMSprop(lr=5e-7, decay=5e-5)
model.compile(loss='categorical_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])

checkpoint_cb = ModelCheckpoint("snapshot/{epoch:03d}-{val_acc:.5f}.hdf5", save_best_only=True)

model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size,
    callbacks=[checkpoint_cb])

First, instantiate the stock VGG16 model, specifying the size of the input tensor.

python
input_tensor = Input(shape=(img_width, img_height, 3))

vgg16 = VGG16(include_top=False, weights='imagenet', input_tensor=input_tensor)

Next, attach your own classifier on top. Note that include_top must be set to False so that VGG16's original fully connected layers are dropped and can be replaced with your own.

python
top_model = Sequential()
top_model.add(Flatten(input_shape=vgg16.output_shape[1:]))
top_model.add(Dense(256, activation='relu', kernel_initializer='he_normal'))
top_model.add(BatchNormalization())
top_model.add(Dropout(0.5))
top_model.add(Dense(nb_category, activation='softmax'))

# Connect vgg16 with top_model
model = Model(inputs=vgg16.input, outputs=top_model(vgg16.output))
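If you are curious what shape the Flatten layer receives, you can print the convolutional output shape; for a (200, 150, 3) input and VGG16's five pooling stages it should come out to (None, 6, 4, 512):

python
# check the shape that top_model's Flatten receives
# (expected to be (None, 6, 4, 512) for a 200x150 input)
print(vgg16.output_shape)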

To be honest, I don't fully understand why ReLU is the right choice of activation here, so don't take this top-model architecture as a strong recommendation.

Next, freeze the weights of the early layers. Without this, the pretrained weights would also be updated during training; here the first 15 layers are kept fixed.

python
for layer in model.layers[:15]:
    layer.trainable = False
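To verify what actually gets frozen, you can list each layer and its trainable flag:

python
# inspect layer names and trainable flags after freezing
for i, layer in enumerate(model.layers):
    print(i, layer.name, layer.trainable)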

Compile the model, specifying an optimizer, i.e. the algorithm used to update the parameters.

python
optimizer = optimizers.RMSprop(lr=5e-7, decay=5e-5)
model.compile(loss='categorical_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])
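As a side note, the reference article [^1] fine-tunes with SGD at a very low learning rate instead of RMSprop, on the grounds that small updates are safer for pretrained weights; if RMSprop proves unstable, something like this is worth trying:

python
# alternative from [^1]: slow SGD with momentum for fine-tuning
optimizer = optimizers.SGD(lr=1e-4, momentum=0.9)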

Finally, train. If a ModelCheckpoint is set in callbacks, intermediate best-so-far snapshots are saved during training; it is optional.

python
checkpoint_cb = ModelCheckpoint("snapshot/{epoch:03d}-{val_acc:.5f}.hdf5", save_best_only=True)

model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size,
    callbacks=[checkpoint_cb])
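With 50 epochs over a data set this large, it can also be handy to stop automatically when the validation accuracy plateaus. This is optional and was not part of my original run:

python
# optional: stop training when val_acc stops improving
from keras.callbacks import EarlyStopping

early_stopping_cb = EarlyStopping(monitor="val_acc", patience=3)
# then pass callbacks=[checkpoint_cb, early_stopping_cb] to fit_generator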

3.2 Predict

python
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.applications.vgg16 import VGG16
from keras.layers import Dense, Dropout, Flatten, Input, BatchNormalization
from keras.models import Model, Sequential, load_model
import pandas as pd
import numpy as np
import os
from pandas import DataFrame

test_data_dir = "/test/"

test_datagen = ImageDataGenerator(rescale=1. / 255)

img_width, img_height = 200, 150
nb_test_samples = 115474
batch_size = 1
nb_category = 14951

test_generator = test_datagen.flow_from_directory(
    test_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode=None,
    shuffle=False)

model = load_model("model.h5")

pred = model.predict_generator(
    test_generator,
    steps=nb_test_samples,
    verbose=1)

This is not very different from the training code. Data is again read with flow_from_directory, but note that a sub-folder is required even for the test data, whose classes are unknown. For example, it will not work unless you organize the data as follows, with all files under a single dummy sub-folder.

text
data/
  test/
    sub/
      aaa.jpg
      bbb.jpg
      ...

Since the classes are unknown, set class_mode to None.
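pred is an array of per-class probabilities with one row per test file. To turn it into labels you need the class-to-index mapping that flow_from_directory assigned during training. A minimal sketch, assuming you dumped train_generator.class_indices to a hypothetical class_indices.pkl at training time:

python
# sketch: map predicted indices back to class names
# ("class_indices.pkl" is a hypothetical dump of train_generator.class_indices)
import pickle

with open("class_indices.pkl", "rb") as f:
    class_indices = pickle.load(f)
idx_to_class = {v: k for k, v in class_indices.items()}

pred_labels = [idx_to_class[i] for i in np.argmax(pred, axis=1)]
result = DataFrame({"file": test_generator.filenames,
                    "label": pred_labels})
result.to_csv("result.csv", index=False)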

4 Summary

  • Learned how to fine-tune VGG16 with Keras.
  • Large-scale data can be loaded with flow_from_directory.
  • Note that class sub-folders are needed for prediction as well.

5 Reference

[^1]: Building powerful image classification models using very little data
[^2]: Keras Documentation
