You may remember our earlier post on applying 1D convolution in natural language processing.
Since so many readers have asked how to apply 1D convolution in practice, David 9 is devoting another post to a hands-on example. This time we take the leaf classification problem on Kaggle and walk through how to implement a 1D convolutional network for it.
The dataset comes from: www.kaggle.com/alexanderla…
train.csv is the training set and test.csv is the test set. This is a leaf-species classification problem with three channels of 64 features each: one channel for margin features, one for shape features, and one for texture features (Kaggle has already extracted all of these features for you). The output is the predicted species label.
Without further ado, here is the full code with detailed comments (tested and working):
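Before diving into the full script, it helps to see how one flat row of the 192 feature columns maps onto the (64, 3) shape the network expects. A minimal sketch with synthetic data (the margin/shape/texture column order is assumed from the code below):

```python
import numpy as np

# Synthetic stand-in for one row of the 192 feature columns:
# columns 0..63 = margin, 64..127 = shape, 128..191 = texture (assumed order).
nb_features = 64
row = np.arange(192, dtype=float)

# Stack the three 64-wide blocks as channels, giving shape (64, 3).
channels = np.stack([row[:nb_features],
                     row[nb_features:2 * nb_features],
                     row[2 * nb_features:]], axis=-1)

print(channels.shape)   # (64, 3)
print(channels[0])      # first position across the three channels: 0, 64, 128
```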
# -*- coding: utf-8 -*-
# Basic imports
import numpy as np
import pandas as pd
# LabelEncoder encodes the output labels
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import StandardScaler
# StratifiedShuffleSplit shuffles the data and splits it into training and validation sets
from sklearn.model_selection import StratifiedShuffleSplit
# The Keras version used here is 2.0.1
from keras.models import Sequential
from keras.layers import Dense, Activation, Flatten, Conv1D, Dropout
from keras.optimizers import SGD
from keras.utils import np_utils

# This dataset is a leaf-species classification problem with three channels
# of 64 features each: one for margin, one for shape, and one for texture.
# The output is the predicted species label.
train = pd.read_csv('./train.csv')
test = pd.read_csv('./test.csv')

def encode(train, test):
    # Encode the species labels with LabelEncoder; labels is the array of
    # encoded labels for the training set
    label_encoder = LabelEncoder().fit(train.species)
    labels = label_encoder.transform(train.species)
    classes = list(label_encoder.classes_)
    # Drop the columns that are not needed as features
    train = train.drop(['species', 'id'], axis=1)
    test = test.drop('id', axis=1)
    return train, labels, test, classes

train, labels, test, classes = encode(train, test)

# Standardize the training-set features
scaler = StandardScaler().fit(train.values)
scaled_train = scaler.transform(train.values)

# Split into training and validation sets, with 10% held out for validation
# (n_splits=1 gives a single split; without it sklearn defaults to 10 splits)
sss = StratifiedShuffleSplit(n_splits=1, test_size=0.1, random_state=23)
for train_index, valid_index in sss.split(scaled_train, labels):
    X_train, X_valid = scaled_train[train_index], scaled_train[valid_index]
    y_train, y_valid = labels[train_index], labels[valid_index]

# Each input channel has 64 features, and there are 3 channels
nb_features = 64
nb_class = len(classes)

# Reshape the input into the layout Keras expects: (samples, steps, channels)
X_train_r = np.zeros((len(X_train), nb_features, 3))
# Initialize everything to zero first, then copy the data over channel by channel
X_train_r[:, :, 0] = X_train[:, :nb_features]
X_train_r[:, :, 1] = X_train[:, nb_features:128]
X_train_r[:, :, 2] = X_train[:, 128:]

# The validation set needs the same reshape
X_valid_r = np.zeros((len(X_valid), nb_features, 3))
X_valid_r[:, :, 0] = X_valid[:, :nb_features]
X_valid_r[:, :, 1] = X_valid[:, nb_features:128]
X_valid_r[:, :, 2] = X_valid[:, 128:]

# The 1D convolutional model in Keras. The original author tried adding more
# convolutional layers, but it did not improve accuracy; each channel carries
# little information on its own, so a very deep network is a poor fit here.
model = Sequential()
# The Conv1D layer uses 512 filters on 64x3 input. Note that "1D" refers to
# the convolution sliding along a single axis, not to the input being
# one-dimensional.
model.add(Conv1D(filters=512, kernel_size=1, input_shape=(nb_features, 3)))
model.add(Activation('relu'))
model.add(Flatten())
model.add(Dropout(0.4))
model.add(Dense(2048, activation='relu'))
model.add(Dense(1024, activation='relu'))
model.add(Dense(nb_class))
# softmax is the usual choice for multi-class classification
model.add(Activation('softmax'))

# One-hot encode the labels
y_train = np_utils.to_categorical(y_train, nb_class)
y_valid = np_utils.to_categorical(y_valid, nb_class)

sgd = SGD(lr=0.01, nesterov=True, decay=1e-6, momentum=0.9)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
model.summary()

nb_epoch = 15
model.fit(X_train_r, y_train, epochs=nb_epoch, validation_data=(X_valid_r, y_valid), batch_size=16)
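As the comments above note, a Conv1D with kernel size 1 is simply the same 3→512 linear map applied independently at each of the 64 positions. A numpy sketch of that equivalence (random weights, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
nb_features, in_ch, filters = 64, 3, 512

x = rng.standard_normal((nb_features, in_ch))   # one sample, shape (64, 3)
W = rng.standard_normal((in_ch, filters))       # kernel-size-1 conv weights
b = rng.standard_normal(filters)                # one bias per filter

# With kernel size 1, the convolution reduces to a matmul at every position.
y = x @ W + b

print(y.shape)          # (64, 512)
print(W.size + b.size)  # 2048 trainable params, matching model.summary() below
```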
Below is the output of model.summary() for reference:
Layer (type)                 Output Shape              Param #
=================================================================
conv1d_1 (Conv1D)            (None, 64, 512)           2048
_________________________________________________________________
activation_1 (Activation)    (None, 64, 512)           0
_________________________________________________________________
flatten_1 (Flatten)          (None, 32768)             0
_________________________________________________________________
dropout_1 (Dropout)          (None, 32768)             0
_________________________________________________________________
dense_1 (Dense)              (None, 2048)              67110912
_________________________________________________________________
dense_2 (Dense)              (None, 1024)              2098176
_________________________________________________________________
dense_3 (Dense)              (None, 99)                101475
_________________________________________________________________
activation_2 (Activation)    (None, 99)                0
=================================================================
Total params: 69,312,611
Trainable params: 69,312,611
Non-trainable params: 0
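The totals above can be checked by hand: each Dense layer contributes inputs × outputs + outputs parameters, and the kernel-size-1 Conv1D contributes (kernel_size × in_channels + 1) per filter:

```python
# Per-layer parameter counts for the summary above.
conv   = (1 * 3 + 1) * 512        # kernel_size * in_channels + bias, per filter -> 2048
dense1 = 64 * 512 * 2048 + 2048   # flattened 32768 inputs -> 67,110,912
dense2 = 2048 * 1024 + 1024       # -> 2,098,176
dense3 = 1024 * 99 + 99           # -> 101,475

total = conv + dense1 + dense2 + dense3
print(total)                      # 69312611
```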
This post is licensed under the Attribution-NonCommercial-NoDerivs 3.0 China Mainland license. Copyright belongs to the original posts on "David 9's Blog". For reprint permission, contact WeChat: david9ml, or email: yanchao727@gmail.com