MLP

train_MLP_classifier(X, y, neurons, validation_split=0.2, validation_data=None, activation='relu', output_neurons=1, last_activation='sigmoid', epochs=50, batch_size=32, optimizer='adam', learning_rate=0.001, loss_function='binary_crossentropy', dropout_rate=None, early_stopping=True, es_patience=5, metrics=['accuracy'], random_state=None)

Train MLP (Multilayer Perceptron) using Keras.

Creates a Sequential model with Dense NN layers. For each element in neurons, a Dense layer with that many neurons is created using the specified activation function (activation). If dropout_rate is specified, a Dropout layer is added after each Dense layer.

The parameters default to a binary classification model: sigmoid as the last activation, binary crossentropy as the loss function, and 1 output neuron/unit.

For more information about Keras models, read the documentation here: https://keras.io/.
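
A minimal usage sketch for the binary default configuration (the synthetic data, layer sizes and epoch count are illustrative assumptions, not part of the API):

```python
import numpy as np

from eis_toolkit.prediction.mlp import train_MLP_classifier

# Illustrative, standardized input: 200 samples with 8 features each.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 8)).astype(np.float32)
# Binary labels (0 or 1) as a 1-dimensional array.
y = (X[:, 0] + X[:, 1] > 0).astype(np.int32)

model, history = train_MLP_classifier(
    X,
    y,
    neurons=[16, 8],  # two hidden layers with 16 and 8 neurons
    epochs=10,
    random_state=42,
)
print(history["val_loss"][-1])  # validation loss of the final epoch
```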

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `X` | `ndarray` | Input data. Should be a 2-dimensional array where each row represents a sample and each column a feature. Features should ideally be normalized or standardized. | required |
| `y` | `ndarray` | Target labels. For binary classification, y should be a 1-dimensional array of binary labels (0 or 1). For multi-class classification, y should be a 2D array with one-hot encoded labels. The number of columns should match the number of classes. | required |
| `neurons` | `Sequence[int]` | Number of neurons in each hidden layer. | required |
| `validation_split` | `Optional[float]` | Fraction of data used for validation during training. Value must be > 0 and < 1, or None. | `0.2` |
| `validation_data` | `Optional[Tuple[ndarray, ndarray]]` | Separate dataset used for validation during training. Overrides validation_split if provided. Expected data form is (X_valid, y_valid). | `None` |
| `activation` | `Literal['relu', 'linear', 'sigmoid', 'tanh']` | Activation function used in each hidden layer. | `'relu'` |
| `output_neurons` | `int` | Number of neurons in the output layer. | `1` |
| `last_activation` | `Literal['sigmoid', 'softmax']` | Activation function used in the output layer. | `'sigmoid'` |
| `epochs` | `int` | Number of epochs to train the model. | `50` |
| `batch_size` | `int` | Number of samples per gradient update. | `32` |
| `optimizer` | `Literal['adam', 'adagrad', 'rmsprop', 'sdg']` | Optimizer to be used. | `'adam'` |
| `learning_rate` | `Number` | Learning rate to be used in training. Value must be > 0. | `0.001` |
| `loss_function` | `Literal['binary_crossentropy', 'categorical_crossentropy']` | Loss function to be used. | `'binary_crossentropy'` |
| `dropout_rate` | `Optional[Number]` | Fraction of the input units to drop. Value must be >= 0 and <= 1. | `None` |
| `early_stopping` | `bool` | Whether to use early stopping in training. | `True` |
| `es_patience` | `int` | Number of epochs with no improvement after which training will be stopped. | `5` |
| `metrics` | `Optional[Sequence[Literal['accuracy', 'precision', 'recall', 'f1_score']]]` | Metrics to be evaluated by the model during training and testing. | `['accuracy']` |
| `random_state` | `Optional[int]` | Seed for random number generation. Sets Python, Numpy and Tensorflow seeds to make the program deterministic. If None, no fixed seed is used. | `None` |

Returns:

| Type | Description |
| --- | --- |
| `Tuple[Model, dict]` | Trained MLP model and training history. |

Raises:

| Type | Description |
| --- | --- |
| `InvalidParameterValueException` | Some of the numeric parameters have invalid values. |
| `InvalidDataShapeException` | Shape of X or y is invalid. |
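
For the multi-class case described under y, several defaults must be changed together: one-hot encoded labels, output_neurons equal to the number of classes, softmax as the last activation, and categorical crossentropy as the loss. A sketch with three illustrative classes:

```python
import numpy as np
from tensorflow import keras

from eis_toolkit.prediction.mlp import train_MLP_classifier

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5)).astype(np.float32)
labels = rng.integers(0, 3, size=300)                  # class indices 0..2
y = keras.utils.to_categorical(labels, num_classes=3)  # one-hot, shape (300, 3)

model, history = train_MLP_classifier(
    X,
    y,
    neurons=[32, 16],
    output_neurons=3,  # must match the number of classes (columns of y)
    last_activation="softmax",
    loss_function="categorical_crossentropy",
    random_state=0,
)
```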

Source code in eis_toolkit/prediction/mlp.py
@beartype
def train_MLP_classifier(
    X: np.ndarray,
    y: np.ndarray,
    neurons: Sequence[int],
    validation_split: Optional[float] = 0.2,
    validation_data: Optional[Tuple[np.ndarray, np.ndarray]] = None,
    activation: Literal["relu", "linear", "sigmoid", "tanh"] = "relu",
    output_neurons: int = 1,
    last_activation: Literal["sigmoid", "softmax"] = "sigmoid",
    epochs: int = 50,
    batch_size: int = 32,
    optimizer: Literal["adam", "adagrad", "rmsprop", "sdg"] = "adam",
    learning_rate: Number = 0.001,
    loss_function: Literal["binary_crossentropy", "categorical_crossentropy"] = "binary_crossentropy",
    dropout_rate: Optional[Number] = None,
    early_stopping: bool = True,
    es_patience: int = 5,
    metrics: Optional[Sequence[Literal["accuracy", "precision", "recall", "f1_score"]]] = ["accuracy"],
    random_state: Optional[int] = None,
) -> Tuple[keras.Model, dict]:
    """
    Train MLP (Multilayer Perceptron) using Keras.

    Creates a Sequential model with Dense NN layers. For each element in `neurons`, a Dense layer with that many
    neurons is created using the specified activation function (`activation`). If `dropout_rate` is specified, a
    Dropout layer is added after each Dense layer.

    The parameters default to a binary classification model: sigmoid as the last activation, binary crossentropy as
    the loss function and 1 output neuron/unit.

    For more information about Keras models, read the documentation here: https://keras.io/.

    Args:
        X: Input data. Should be a 2-dimensional array where each row represents a sample and each column a
            feature. Features should ideally be normalized or standardized.
        y: Target labels. For binary classification, y should be a 1-dimensional array of binary labels (0 or 1).
            For multi-class classification, y should be a 2D array with one-hot encoded labels. The number of columns
            should match the number of classes.
        neurons: Number of neurons in each hidden layer.
        validation_split: Fraction of data used for validation during training. Value must be > 0 and < 1 or None.
            Defaults to 0.2.
        validation_data: Separate dataset used for validation during training. Overrides validation_split if
            provided. Expected data form is (X_valid, y_valid). Defaults to None.
        activation: Activation function used in each hidden layer. Defaults to 'relu'.
        output_neurons: Number of neurons in the output layer. Defaults to 1.
        last_activation: Activation function used in the output layer. Defaults to 'sigmoid'.
        epochs: Number of epochs to train the model. Defaults to 50.
        batch_size: Number of samples per gradient update. Defaults to 32.
        optimizer: Optimizer to be used. Defaults to 'adam'.
        learning_rate: Learning rate to be used in training. Value must be > 0. Defaults to 0.001.
        loss_function: Loss function to be used. Defaults to 'binary_crossentropy'.
        dropout_rate: Fraction of the input units to drop. Value must be >= 0 and <= 1. Defaults to None.
        early_stopping: Whether or not to use early stopping in training. Defaults to True.
        es_patience: Number of epochs with no improvement after which training will be stopped. Defaults to 5.
        metrics: Metrics to be evaluated by the model during training and testing. Defaults to ['accuracy'].
        random_state: Seed for random number generation. Sets Python, Numpy and Tensorflow seeds to make the
            program deterministic. Defaults to None (no fixed seed).

    Returns:
        Trained MLP model and training history.

    Raises:
        InvalidParameterValueException: Some of the numeric parameters have invalid values.
        InvalidDataShapeException: Shape of X or y is invalid.
    """
    # 1. Check input data
    _check_ML_model_data_input(X=X, y=y)
    _check_MLP_inputs(
        neurons=neurons,
        validation_split=validation_split,
        learning_rate=learning_rate,
        dropout_rate=dropout_rate,
        es_patience=es_patience,
        batch_size=batch_size,
        epochs=epochs,
        output_neurons=output_neurons,
        loss_function=loss_function,
    )

    if random_state is not None:
        keras.utils.set_random_seed(random_state)

    # 2. Create and compile a sequential model
    model = keras.Sequential()

    model.add(keras.layers.Input(shape=(X.shape[1],)))

    for neuron in neurons:
        model.add(keras.layers.Dense(units=neuron, activation=activation))

        if dropout_rate is not None:
            model.add(keras.layers.Dropout(dropout_rate))

    model.add(keras.layers.Dense(units=output_neurons, activation=last_activation))

    model.compile(
        optimizer=_keras_optimizer(optimizer, learning_rate=learning_rate), loss=loss_function, metrics=metrics
    )

    # 3. Train the model
    # Early stopping callback
    callbacks = [keras.callbacks.EarlyStopping(monitor="val_loss", patience=es_patience)] if early_stopping else []

    history = model.fit(
        X,
        y,
        epochs=epochs,
        validation_split=validation_split if validation_split else 0.0,
        validation_data=validation_data,
        batch_size=batch_size,
        callbacks=callbacks,
    )

    return model, history.history
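
The second element of the returned tuple is the Keras History.history dict, mapping loss/metric names to per-epoch lists (validation entries carry a val_ prefix). A sketch of typical follow-up use; X_new is a hypothetical array of unseen samples:

```python
# Per-epoch training curves recorded during fit.
print(history["accuracy"])      # training accuracy per epoch
print(history["val_accuracy"])  # validation accuracy per epoch

# With the default sigmoid output, predictions are probabilities in [0, 1].
probabilities = model.predict(X_new)             # X_new: hypothetical new samples
predictions = (probabilities > 0.5).astype(int)  # threshold to get class labels
```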

train_MLP_regressor(X, y, neurons, validation_split=0.2, validation_data=None, activation='relu', output_neurons=1, last_activation='linear', epochs=50, batch_size=32, optimizer='adam', learning_rate=0.001, loss_function='mse', dropout_rate=None, early_stopping=True, es_patience=5, metrics=['mse'], random_state=None)

Train MLP (Multilayer Perceptron) using Keras.

Creates a Sequential model with Dense NN layers. For each element in neurons, a Dense layer with that many neurons is created using the specified activation function (activation). If dropout_rate is specified, a Dropout layer is added after each Dense layer.

For more information about Keras models, read the documentation here: https://keras.io/.
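
A minimal usage sketch (the noisy linear data and layer sizes are illustrative assumptions, not part of the API):

```python
import numpy as np

from eis_toolkit.prediction.mlp import train_MLP_regressor

# Illustrative data: a noisy linear target over 4 standardized features.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4)).astype(np.float32)
weights = np.array([0.5, -1.0, 2.0, 0.0], dtype=np.float32)
y = X @ weights + rng.normal(scale=0.1, size=200).astype(np.float32)

model, history = train_MLP_regressor(
    X,
    y,
    neurons=[16, 8],
    epochs=10,
    random_state=1,
)
```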

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `X` | `ndarray` | Input data. Should be a 2-dimensional array where each row represents a sample and each column a feature. Features should ideally be normalized or standardized. | required |
| `y` | `ndarray` | Target labels. Should be a 1-dimensional array where each entry corresponds to the continuous target value for the respective sample in X. | required |
| `neurons` | `Sequence[int]` | Number of neurons in each hidden layer. | required |
| `validation_split` | `Optional[float]` | Fraction of data used for validation during training. Value must be > 0 and < 1, or None. | `0.2` |
| `validation_data` | `Optional[Tuple[ndarray, ndarray]]` | Separate dataset used for validation during training. Overrides validation_split if provided. Expected data form is (X_valid, y_valid). | `None` |
| `activation` | `Literal['relu', 'linear', 'sigmoid', 'tanh']` | Activation function used in each hidden layer. | `'relu'` |
| `output_neurons` | `int` | Number of neurons in the output layer. | `1` |
| `last_activation` | `Literal['linear']` | Activation function used in the output layer. | `'linear'` |
| `epochs` | `int` | Number of epochs to train the model. | `50` |
| `batch_size` | `int` | Number of samples per gradient update. | `32` |
| `optimizer` | `Literal['adam', 'adagrad', 'rmsprop', 'sdg']` | Optimizer to be used. | `'adam'` |
| `learning_rate` | `Number` | Learning rate to be used in training. Value must be > 0. | `0.001` |
| `loss_function` | `Literal['mse', 'mae', 'hinge', 'huber']` | Loss function to be used. | `'mse'` |
| `dropout_rate` | `Optional[Number]` | Fraction of the input units to drop. Value must be >= 0 and <= 1. | `None` |
| `early_stopping` | `bool` | Whether to use early stopping in training. | `True` |
| `es_patience` | `int` | Number of epochs with no improvement after which training will be stopped. | `5` |
| `metrics` | `Optional[Sequence[Literal['mse', 'rmse', 'mae']]]` | Metrics to be evaluated by the model during training and testing. | `['mse']` |
| `random_state` | `Optional[int]` | Seed for random number generation. Sets Python, Numpy and Tensorflow seeds to make the program deterministic. If None, no fixed seed is used. | `None` |

Returns:

| Type | Description |
| --- | --- |
| `Tuple[Model, dict]` | Trained MLP model and training history. |

Raises:

| Type | Description |
| --- | --- |
| `InvalidParameterValueException` | Some of the numeric parameters have invalid values. |
| `InvalidDataShapeException` | Shape of X or y is invalid. |
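
When a dedicated validation set is available, pass it via validation_data and disable the automatic split; combined with early stopping, training halts once val_loss stops improving for es_patience epochs. A sketch where X_train, y_train, X_valid and y_valid are hypothetical pre-split arrays:

```python
model, history = train_MLP_regressor(
    X_train,
    y_train,
    neurons=[16],
    validation_split=None,               # disable the automatic 0.2 split
    validation_data=(X_valid, y_valid),  # use the dedicated validation set instead
    early_stopping=True,
    es_patience=5,  # stop after 5 epochs without val_loss improvement
)
```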

Source code in eis_toolkit/prediction/mlp.py
@beartype
def train_MLP_regressor(
    X: np.ndarray,
    y: np.ndarray,
    neurons: Sequence[int],
    validation_split: Optional[float] = 0.2,
    validation_data: Optional[Tuple[np.ndarray, np.ndarray]] = None,
    activation: Literal["relu", "linear", "sigmoid", "tanh"] = "relu",
    output_neurons: int = 1,
    last_activation: Literal["linear"] = "linear",
    epochs: int = 50,
    batch_size: int = 32,
    optimizer: Literal["adam", "adagrad", "rmsprop", "sdg"] = "adam",
    learning_rate: Number = 0.001,
    loss_function: Literal["mse", "mae", "hinge", "huber"] = "mse",
    dropout_rate: Optional[Number] = None,
    early_stopping: bool = True,
    es_patience: int = 5,
    metrics: Optional[Sequence[Literal["mse", "rmse", "mae"]]] = ["mse"],
    random_state: Optional[int] = None,
) -> Tuple[keras.Model, dict]:
    """
    Train MLP (Multilayer Perceptron) using Keras.

    Creates a Sequential model with Dense NN layers. For each element in `neurons`, a Dense layer with that many
    neurons is created using the specified activation function (`activation`). If `dropout_rate` is specified, a
    Dropout layer is added after each Dense layer.

    For more information about Keras models, read the documentation here: https://keras.io/.

    Args:
        X: Input data. Should be a 2-dimensional array where each row represents a sample and each column a
            feature. Features should ideally be normalized or standardized.
        y: Target labels. Should be a 1-dimensional array where each entry corresponds to the continuous
            target value for the respective sample in X.
        neurons: Number of neurons in each hidden layer.
        validation_split: Fraction of data used for validation during training. Value must be > 0 and < 1 or None.
            Defaults to 0.2.
        validation_data: Separate dataset used for validation during training. Overrides validation_split if
            provided. Expected data form is (X_valid, y_valid). Defaults to None.
        activation: Activation function used in each hidden layer. Defaults to 'relu'.
        output_neurons: Number of neurons in the output layer. Defaults to 1.
        last_activation: Activation function used in the output layer. Defaults to 'linear'.
        epochs: Number of epochs to train the model. Defaults to 50.
        batch_size: Number of samples per gradient update. Defaults to 32.
        optimizer: Optimizer to be used. Defaults to 'adam'.
        learning_rate: Learning rate to be used in training. Value must be > 0. Defaults to 0.001.
        loss_function: Loss function to be used. Defaults to 'mse'.
        dropout_rate: Fraction of the input units to drop. Value must be >= 0 and <= 1. Defaults to None.
        early_stopping: Whether or not to use early stopping in training. Defaults to True.
        es_patience: Number of epochs with no improvement after which training will be stopped. Defaults to 5.
        metrics: Metrics to be evaluated by the model during training and testing. Defaults to ['mse'].
        random_state: Seed for random number generation. Sets Python, Numpy and Tensorflow seeds to make the
            program deterministic. Defaults to None (no fixed seed).

    Returns:
        Trained MLP model and training history.

    Raises:
        InvalidParameterValueException: Some of the numeric parameters have invalid values.
        InvalidDataShapeException: Shape of X or y is invalid.
    """
    # 1. Check input data
    _check_ML_model_data_input(X=X, y=y)
    _check_MLP_inputs(
        neurons=neurons,
        validation_split=validation_split,
        learning_rate=learning_rate,
        dropout_rate=dropout_rate,
        es_patience=es_patience,
        batch_size=batch_size,
        epochs=epochs,
        output_neurons=output_neurons,
        loss_function=loss_function,
    )

    if random_state is not None:
        keras.utils.set_random_seed(random_state)

    # 2. Create and compile a sequential model
    model = keras.Sequential()

    model.add(keras.layers.Input(shape=(X.shape[1],)))

    for neuron in neurons:
        model.add(keras.layers.Dense(units=neuron, activation=activation))

        if dropout_rate is not None:
            model.add(keras.layers.Dropout(dropout_rate))

    model.add(keras.layers.Dense(units=output_neurons, activation=last_activation))

    model.compile(
        optimizer=_keras_optimizer(optimizer, learning_rate=learning_rate), loss=loss_function, metrics=metrics
    )

    # 3. Train the model
    # Early stopping callback
    callbacks = [keras.callbacks.EarlyStopping(monitor="val_loss", patience=es_patience)] if early_stopping else []

    history = model.fit(
        X,
        y,
        epochs=epochs,
        validation_split=validation_split if validation_split else 0.0,
        validation_data=validation_data,
        batch_size=batch_size,
        callbacks=callbacks,
    )

    return model, history.history
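
As with the classifier, the returned model is a regular keras.Model; with the linear output, predict yields continuous values directly. A short sketch, reusing the X from the usage example above for illustration:

```python
y_pred = model.predict(X).squeeze()  # (n_samples,) array of continuous predictions
print(history["val_loss"][-1])       # final-epoch validation MSE (the compiled loss)
```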