Transfer Learning with Keras!

Hi everyone, long time no see!

Recently, at the request of a senior in my department (just kidding), I started studying transfer learning and figured I'd document the process along the way. I'm still learning, so if anything here is wrong or there's a more professional approach, please don't hesitate to give feedback and suggestions! I had originally planned to stop writing after the Ironman contest ended, but I noticed people are actually following and subscribing to my articles, so I decided to keep writing.

What is transfer learning?

Transfer learning takes the parameters of an already-trained model and transfers them to a new model to help train it. Because the new model doesn't have to learn from scratch, training becomes far more efficient. You can find more detailed introductions online, so I won't go into depth here.

Let's get started!

Optional: whenever I use packages that tend to spew warnings, I hide them first (seeing warnings puts me in a bad mood).

import warnings
warnings.filterwarnings('ignore')

As the title says, here I mainly use Keras rather than raw TensorFlow. Keras is a high-level deep learning library, which makes it friendly for beginners (like me)! Besides the deep learning packages, we also import the usual data science packages (introduced in my Ironman series).

import numpy as np
import keras
from keras import backend as K
from keras.models import Sequential
from keras.layers import Activation
from keras.layers.core import Dense, Flatten
from keras.optimizers import Adam
from keras.metrics import categorical_crossentropy
from keras.preprocessing.image import ImageDataGenerator
from keras.layers.normalization import BatchNormalization
from keras.layers.convolutional import *
from sklearn.metrics import confusion_matrix
from keras.models import load_model
import itertools
import matplotlib.pyplot as plt
from PIL import Image

Directory tree:

There are three datasets: train, test, and valid. Each contains class subfolders, and each subfolder holds the cat or dog image files.

.
├── transfer-learning-with-keras.ipynb
├── test/
│   ├── cats/
│   └── dogs/
├── train/
│   ├── cats/
│   └── dogs/
└── valid/
    ├── cats/
    └── dogs/
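If you want to recreate this folder layout programmatically, here is a minimal sketch using Python's pathlib. The root folder name `data_root` is a hypothetical choice for illustration; the article keeps train/valid/test next to the notebook itself.

```python
from pathlib import Path

# Hypothetical root; the article keeps train/valid/test beside the notebook.
base = Path("data_root")

# Create a <split>/<class>/ folder for each dataset split and class.
for split in ("train", "valid", "test"):
    for cls in ("cats", "dogs"):
        (base / split / cls).mkdir(parents=True, exist_ok=True)
```

After this, you only need to copy the cat and dog images into the matching subfolders.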

Define the directory paths:

train_path = './train'
valid_path = './valid'
test_path = './test'

Image preprocessing:

As its name suggests, ImageDataGenerator generates image data:
it produces batches of image data, and during training it keeps generating data indefinitely until the specified number of epochs has been reached.

train_batches = ImageDataGenerator().flow_from_directory(train_path, target_size=(224,224), classes=['dogs', 'cats'], batch_size=10)
valid_batches = ImageDataGenerator().flow_from_directory(valid_path, target_size=(224,224), classes=['dogs', 'cats'], batch_size=4)
test_batches = ImageDataGenerator().flow_from_directory(test_path, target_size=(224,224), classes=['dogs', 'cats'], batch_size=10)

flow_from_directory()
Takes a directory path as its argument and generates batches of (optionally augmented/normalized) data in an endless loop.

classes
An optional parameter: a list of subfolder names, here ['dogs', 'cats'] for our two classes. The default is None; if not provided, the class list is inferred automatically from the subfolder names/structure under the directory. Each subfolder is treated as a separate class!

target_size=(224,224)
A tuple of integers, defaulting to (256,256); images will be resized to this size. Since I'm building on the VGG16 model, I set it to (224,224) here.

Output:

Found 40 images belonging to 2 classes.
Found 10 images belonging to 2 classes.
Found 10 images belonging to 2 classes.

Let's check the shape:

print(train_batches.image_shape)
# (224, 224, 3)

The VGG16 model

Load the VGG16 model bundled with Keras; the first time you use it, you'll need to wait a moment for the weights to download:

vgg16_model = keras.applications.vgg16.VGG16()

Build our own model:

model = Sequential()
model.summary()

You should see:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
_________________________________________________________________

Next, we add VGG16's layers to our model:

for layer in vgg16_model.layers:
    model.add(layer)
model.summary()

The result:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
_________________________________________________________________
flatten (Flatten)            (None, 25088)             0         
_________________________________________________________________
fc1 (Dense)                  (None, 4096)              102764544 
_________________________________________________________________
fc2 (Dense)                  (None, 4096)              16781312  
_________________________________________________________________
predictions (Dense)          (None, 1000)              4097000   
=================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
_________________________________________________________________
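The parameter counts in this summary can be sanity-checked by hand: a Dense layer has inputs × units weights plus one bias per unit, and a Conv2D layer has kernel_h × kernel_w × in_channels × filters weights plus one bias per filter. A small sketch verifying a few rows of the table above:

```python
# Dense layer parameters: weight matrix (inputs x units) plus one bias per unit.
def dense_params(inputs, units):
    return inputs * units + units

# Conv2D parameters: one kernel per filter plus one bias per filter.
def conv2d_params(kh, kw, in_ch, filters):
    return kh * kw * in_ch * filters + filters

# block1_conv1: 3x3 kernels over 3 input channels, 64 filters
print(conv2d_params(3, 3, 3, 64))   # 1792
# flatten: the 7x7x512 feature map unrolls into a vector
print(7 * 7 * 512)                  # 25088
# fc1: 25088 inputs into 4096 units
print(dense_params(25088, 4096))    # 102764544
# predictions: 4096 inputs into 1000 classes
print(dense_params(4096, 1000))     # 4097000
```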

Now remove the top predictions layer. Transfer learning generally involves more than just deleting the original output, but given the kind of data we have, here we only drop the last layer and retrain a new one. (Note: in some Keras versions model.layers.pop() removes the layer from model.layers without rewiring the model's output; if you run into that, rebuild the model from vgg16_model.layers[:-1] instead.)

model.layers.pop()

VGG16 originally has 1000 output classes, but here we want only 2.
Set the trainable flag of the existing layers to False, since we only need to train the final layer:

for layer in model.layers:
    layer.trainable = False

Finally, add the new output layer.
Note that we use 2 units here, since we only need the labels 0 and 1, one for each class:

model.add(Dense(2, activation='softmax'))
Layer (type)                 Output Shape              Param #   
=================================================================
input_2 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
_________________________________________________________________
flatten (Flatten)            (None, 25088)             0         
_________________________________________________________________
fc1 (Dense)                  (None, 4096)              102764544 
_________________________________________________________________
fc2 (Dense)                  (None, 4096)              16781312  
_________________________________________________________________
dense_5 (Dense)              (None, 2)                 2002      
=================================================================
Total params: 134,262,546
Trainable params: 2,002
Non-trainable params: 134,260,544
_________________________________________________________________
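The softmax activation on the new 2-unit layer turns the two raw scores into probabilities that sum to 1, one per class. A minimal sketch of what that activation computes, using only the standard library:

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Two raw scores, e.g. (dog, cat); the higher score gets the higher probability.
probs = softmax([2.0, 1.0])
print(probs)  # roughly [0.731, 0.269]
```

The predicted class is then simply the index of the larger probability, which maps back to the 0/1 labels mentioned above.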

Compile the model:

model.compile(Adam(lr=0.00002122), loss='categorical_crossentropy', metrics=['accuracy'])

Time to train!

Set the parameters, including the training and validation datasets, with epochs set to 10. Because the dataset is small, steps_per_epoch and validation_steps don't need to be large; adjust them to the size of your own data!

model.fit_generator(train_batches, steps_per_epoch=10, validation_data=valid_batches, validation_steps=4, epochs=10, verbose=2)
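As a rule of thumb, one epoch covers the whole dataset when steps_per_epoch ≈ ceil(num_samples / batch_size); with 40 training images and a batch size of 10 that would be 4 steps (using 10 here is fine too, since the generator just loops over the data). A small sketch of that arithmetic:

```python
import math

def steps_for(num_samples, batch_size):
    # Number of batches needed to see every sample once per epoch.
    return math.ceil(num_samples / batch_size)

print(steps_for(40, 10))  # 4 (train: 40 images, batch_size=10)
print(steps_for(10, 4))   # 3 (valid: 10 images, batch_size=4)
```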

Wait for it to finish:

Epoch 1/10
 - 6s - loss: 0.6749 - acc: 0.7756 - val_loss: 0.6456 - val_acc: 1.0000
Epoch 2/10
 - 2s - loss: 0.6679 - acc: 0.7856 - val_loss: 0.6674 - val_acc: 0.7143
Epoch 3/10
 - 2s - loss: 0.6586 - acc: 0.8672 - val_loss: 0.6757 - val_acc: 0.7143
Epoch 4/10
 - 2s - loss: 0.6547 - acc: 0.8980 - val_loss: 0.6460 - val_acc: 1.0000
Epoch 5/10
 - 2s - loss: 0.6461 - acc: 0.9898 - val_loss: 0.6455 - val_acc: 1.0000
Epoch 6/10
 - 2s - loss: 0.6443 - acc: 1.0000 - val_loss: 0.6441 - val_acc: 1.0000
Epoch 7/10
 - 2s - loss: 0.6443 - acc: 1.0000 - val_loss: 0.6475 - val_acc: 1.0000
Epoch 8/10
 - 2s - loss: 0.6442 - acc: 1.0000 - val_loss: 0.6512 - val_acc: 0.9286
Epoch 9/10
 - 2s - loss: 0.6440 - acc: 1.0000 - val_loss: 0.6583 - val_acc: 0.8571
Epoch 10/10
 - 2s - loss: 0.6445 - acc: 1.0000 - val_loss: 0.6511 - val_acc: 0.9286

Because there isn't much data, this runs very fast, and you can see the accuracy (acc) reach 100%, which is very satisfying. Not every training run will be this quick, though, so always remember to save the model at the end!

Save the model:

model.save("cat-dog-model-base-VGG16.h5")

Alright, that's it for today's introduction. Thanks for reading! (It's been a while since I last wrote an article.)

More references:

On image preprocessing:
Keras Image preprocessing
On pre-trained models:
Keras Applications

