With early stopping, my model stops training at around 7 epochs because it starts overfitting.
from keras.layers import Embedding, Input, Conv1D, MaxPooling1D, GlobalMaxPooling1D, Dense
from keras.models import Model
from keras.initializers import Constant

MAX_SEQUENCE_LENGTH = 1000
MAX_NUM_WORDS = 20000
EMBEDDING_DIM = 100
VALIDATION_SPLIT = 0.2
output_nodes = 759

# num_words and embedding_matrix come from the tokenizer / pretrained-embedding
# preparation step (not shown); the embedding weights are kept frozen.
embedding_layer = Embedding(num_words,
                            EMBEDDING_DIM,
                            embeddings_initializer=Constant(embedding_matrix),
                            input_length=MAX_SEQUENCE_LENGTH,
                            trainable=False)
print('Training model.')
output_nodes = y_train.shape[1]  # overrides the hard-coded value above
# train a 1D convnet with global maxpooling
sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
embedded_sequences = embedding_layer(sequence_input)
x = Conv1D(128, 5, activation='relu')(embedded_sequences)
x = MaxPooling1D(5)(x)
x = Conv1D(128, 5, activation='relu')(x)
x = MaxPooling1D(5)(x)
x = Conv1D(128, 5, activation='relu')(x)
x = GlobalMaxPooling1D()(x)
x = Dense(128, activation='relu')(x)
preds = Dense(output_nodes, activation='softmax')(x)
model = Model(sequence_input, preds)
model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['acc'])
I tried increasing the number of input nodes, reducing the batch size, and evaluating with k-fold cross-validation (rough sketch below), but I can't get the accuracy above 50%. Any thoughts on how I can achieve higher accuracy? I am trying to predict authors from their text. My data has 98k rows and 4 columns.
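The k-fold evaluation was wired up roughly like the sketch below. build_model() is a hypothetical helper that just rebuilds the network above, the 5-fold StratifiedKFold and the argmax label trick are assumptions for illustration, and x_train/y_train are the padded sequences and one-hot labels from the preprocessing step:

from sklearn.model_selection import StratifiedKFold
import numpy as np

def build_model():
    # hypothetical helper: rebuilds the Conv1D architecture shown above
    sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
    x = embedding_layer(sequence_input)
    x = Conv1D(128, 5, activation='relu')(x)
    x = MaxPooling1D(5)(x)
    x = Conv1D(128, 5, activation='relu')(x)
    x = MaxPooling1D(5)(x)
    x = Conv1D(128, 5, activation='relu')(x)
    x = GlobalMaxPooling1D()(x)
    x = Dense(128, activation='relu')(x)
    preds = Dense(output_nodes, activation='softmax')(x)
    m = Model(sequence_input, preds)
    m.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['acc'])
    return m

labels = np.argmax(y_train, axis=1)  # StratifiedKFold needs integer class ids
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
fold_scores = []
for train_idx, val_idx in skf.split(x_train, labels):
    fold_model = build_model()
    fold_model.fit(x_train[train_idx], y_train[train_idx],
                   batch_size=128, epochs=10,
                   validation_data=(x_train[val_idx], y_train[val_idx]),
                   verbose=2)
    _, acc = fold_model.evaluate(x_train[val_idx], y_train[val_idx], verbose=0)
    fold_scores.append(acc)
print('mean fold accuracy:', np.mean(fold_scores))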
UPDATE:
I made the following changes, and now learning is very slow and the model performs much worse. Am I tweaking the parameters correctly?
from keras import optimizers
from keras.callbacks import EarlyStopping, ReduceLROnPlateau

sgd = optimizers.SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,  # pass the optimizer object; the string 'sgd' would ignore the settings above
              metrics=['acc'])
print("Model compiled")

es = EarlyStopping(patience=3)  # monitors val_loss by default
rlrp = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=2, min_delta=1e-7)
model.fit(x_train, y_train,
          batch_size=128,
          epochs=30,
          validation_data=(x_val, y_val), verbose=2, callbacks=[es, rlrp])
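To check whether ReduceLROnPlateau is shrinking the learning rate so far that learning stalls, a small callback along these lines can print the effective rate each epoch (a sketch; LRLogger is a made-up name, and it assumes the Keras 2 backend API):

from keras import backend as K
from keras.callbacks import Callback

class LRLogger(Callback):
    # prints the optimizer's current learning rate at the end of every epoch
    def on_epoch_end(self, epoch, logs=None):
        lr = K.get_value(self.model.optimizer.lr)
        print('epoch %d: learning rate = %.2e' % (epoch + 1, lr))

model.fit(x_train, y_train,
          batch_size=128,
          epochs=30,
          validation_data=(x_val, y_val), verbose=2,
          callbacks=[es, rlrp, LRLogger()])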