Neural networks are very popular nowadays, so it is a good idea to understand how they work. Luckily, we can rely on a Python library called sklearn (scikit-learn), available at scikit-learn.org.
We’re going to use neural networks to interpolate between two sets of data. The training set is composed of two matrices: an input matrix and an output matrix. The input matrix contains one input sample per row, and the expected output for each sample sits in the corresponding row of the output matrix. Together, the two matrices form the training set of the neural network, as sketched below.
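As a toy illustration of this layout (the numbers here are made up for the example), a training set with three data points, two input parameters and one output variable would look like this:

import numpy as np

# Three data points: row k of input_matrix and row k of output_matrix
# describe the same point.
input_matrix = np.array([[0.1, 0.2],
                         [0.3, 0.4],
                         [0.5, 0.6]])   # shape (3, 2): 3 points, 2 parameters
output_matrix = np.array([[1.0],
                          [2.0],
                          [3.0]])       # shape (3, 1): 3 points, 1 variable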
The input/output interpolation is performed with a small class which uses the sklearn framework. The main objective is to have an N-dimensional interpolation tool. The complete implementation of the class is available at the end of the page, but two of its methods are described here.
class NnInterpolator(object):

    def __init__(self, input_matrix=None, output_matrix=None, calc_now=False):
        """
        input_matrix = nb x ni (number of data-points x number of input params)
            point1: [par1, par2, par3 ...]
            point2: [par1, par2, par3 ...]
            [[par1, par2, par3 ...],
             [par1, par2, par3 ...]]
        output_matrix = nb x no (number of data-points x number of output vars)
            point1: [var1, var2, var3 ...]
            point2: [var1, var2, var3 ...]
            [[var1, var2, var3 ...],
             [var1, var2, var3 ...]]
        calc_now = whether to train the network immediately
        """
        ...

    def __call__(self, new_input_data):
        ...
The first method performs the class initialisation. It requires the input matrix and the output matrix; each row of the matrices corresponds to one combination of input parameters and output values. Once the initialisation is complete, we can use the interpolator by calling the object like a function, which triggers the __call__ method. The class is defined in the nnint.py file.
A simple example
Let’s teach a neural network how to calculate the square of a real number. We have two inputs (two numbers) and we want the neural network to return their squared values. Since we’re going to use Python with the numpy library, we have to write
import numpy as np
and we’ll also use the NnInterpolator class, so we have to import it
from nnint import *
We then declare two variables which define the number of inputs and outputs
n_inputs = 2
n_outputs = 2
In addition, we need to define the size of the training set, i.e. the number of input/output combinations:
n_samples = 25000
At this point, we’re going to generate the training data: a random set of numbers in the range \([0, 1]\) and another set containing their squared values. First, we initialise the random number generator with

np.random.seed()

Called without an argument, this seeds the generator from system entropy, so every run trains on different data; pass a fixed integer (e.g. np.random.seed(42)) if you need reproducible results. The input and output matrices are then obtained as
input_matrix = np.random.rand(n_samples, n_inputs)
output_matrix = input_matrix[:, 0:n_outputs]**2
The interpolator object is instantiated by providing both matrices. Passing calc_now=True trains the network immediately; note that, in the implementation below, training is only triggered from the constructor, so calc_now=True is the practical choice.
interpolator = NnInterpolator(input_matrix, output_matrix, calc_now=True)
Now, let’s suppose we want to calculate the squares of 0.25 and 0.5. This can be done with
test_in = np.array([[0.25, 0.5]])
test_out = interpolator(test_in)
Results can be printed on the screen with

print(test_in)
print(test_out)

Since \(0.25^2 = 0.0625\) and \(0.5^2 = 0.25\), the second line should show values close to [[0.0625 0.25]].
The final script is:
import numpy as np
from nnint import *

n_inputs = 2
n_outputs = 2
n_samples = 25000

np.random.seed()

input_matrix = np.random.rand(n_samples, n_inputs)
output_matrix = input_matrix[:, 0:n_outputs]**2

interpolator = NnInterpolator(input_matrix, output_matrix, calc_now=True)

test_in = np.array([[0.25, 0.5]])
test_out = interpolator(test_in)

print(test_in)
print(test_out)
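To get a feeling for the quality of the interpolation, we can compare the network output with the exact squares on a batch of fresh random points. This is a minimal sketch building on the script above (the variable names here are mine, not part of nnint.py):

check_in = np.random.rand(100, n_inputs)      # 100 unseen samples
check_out = interpolator(check_in)            # network predictions
exact_out = check_in**2                       # analytical answer
mae = np.mean(np.abs(check_out - exact_out))  # mean absolute error
print("mean absolute error:", mae)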
Choosing the number of hidden layers and their size
No exact rule exists for choosing the number and size of the hidden layers, but the literature offers some rules of thumb:
- The sizes of the input and output layers are fixed, since they are given by the number of inputs and outputs;
- One hidden layer is usually enough: a network with a single, sufficiently large hidden layer can approximate the same continuous mappings as a deeper one;
- The number of neurons in the hidden layer should lie between the sizes of the input and output layers, e.g. their mean;
- An upper bound that helps avoid over-fitting is
$$
N_h = \frac{N_s}{\alpha (N_i + N_o)}
$$
where \(N_h\) is the number of neurons in the hidden layer, \(N_s\) is the number of samples in the training set, \(N_i\) is the number of inputs, \(N_o\) is the number of outputs and \(\alpha \in [2, 10]\) is a scaling factor: the larger \(\alpha\), the fewer free parameters you allow relative to the degrees of freedom in your data, and the stronger the protection against over-fitting.
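As a worked example with the numbers from the script above and an arbitrary mid-range choice of \(\alpha = 5\): \(N_s = 25000\) and \(N_i = N_o = 2\), so the bound gives \(N_h = 25000 / (5 \cdot (2 + 2)) = 1250\). The small helper below is just a sketch of this rule of thumb (the function name is mine, not part of nnint.py); note that the class in the next section hard-codes hidden_layer_sizes=(100, 100, 100) instead of applying it.

# Rule-of-thumb upper bound for the hidden layer size (sketch only).
def hidden_layer_upper_bound(n_samples, n_inputs, n_outputs, alpha=5):
    # N_h = N_s / (alpha * (N_i + N_o))
    return n_samples // (alpha * (n_inputs + n_outputs))

print(hidden_layer_upper_bound(25000, 2, 2))  # prints 1250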
The NnInterpolator class
These are the contents of the file nnint.py.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler


class NnInterpolator(object):
    """ Class implementing Neural Network interpolation """

    def __init__(self, input_matrix=None, output_matrix=None, calc_now=False):
        """
        input_matrix = nb x ni (number of data-points x number of input params)
            point1: [par1, par2, par3 ...]
            point2: [par1, par2, par3 ...]
            [[par1, par2, par3 ...],
             [par1, par2, par3 ...]]
        output_matrix = nb x no (number of data-points x number of output vars)
            point1: [var1, var2, var3 ...]
            point2: [var1, var2, var3 ...]
            [[var1, var2, var3 ...],
             [var1, var2, var3 ...]]
        calc_now = whether to train the network immediately
        """
        self._nb = None
        self._ni = None
        self._no = None
        self._input_matrix = input_matrix
        self._output_matrix = output_matrix
        if (input_matrix is not None) and (output_matrix is not None):
            self._nb = input_matrix.shape[0]
            self._ni = input_matrix.shape[1]
            if self._nb != output_matrix.shape[0]:
                raise ValueError("NB should be the same!")
            self._no = output_matrix.shape[1]
        if calc_now:
            self._calc_interpolation()

    def __call__(self, new_input_data):
        # Dispatch on dimensionality: 2-D arrays are treated as a batch
        # of points, 1-D arrays as a single point.
        if len(new_input_data.shape) > 1:
            return self.interp_matrix(new_input_data)
        else:
            return self.interp_vector(new_input_data)

    def _calc_interpolation(self):
        # Standardise inputs and outputs to zero mean and unit variance,
        # which helps the network training converge.
        self._scaler_in = StandardScaler()
        self._scaler_in.fit(self._input_matrix)
        self._scaler_out = StandardScaler()
        self._scaler_out.fit(self._output_matrix)
        # Train a multi-layer perceptron regressor on the scaled data.
        self._clf = MLPRegressor(activation='logistic',
                                 max_iter=1000,
                                 tol=1e-30,
                                 hidden_layer_sizes=(100, 100, 100))
        self._clf.fit(self._scaler_in.transform(self._input_matrix),
                      self._scaler_out.transform(self._output_matrix))

    def interp_vector(self, unscaled_new_input_vector):
        # Scale the input, predict, then map the prediction back to the
        # original output units.
        new_input_vector = self._scaler_in.transform(
            unscaled_new_input_vector.reshape(1, -1))
        output = self._clf.predict(new_input_vector)
        # reshape(1, -1) also covers the single-output case, where
        # predict() returns a 1-D array.
        unscaled_output = self._scaler_out.inverse_transform(
            output.reshape(1, -1))
        return unscaled_output

    def interp_matrix(self, input_matrix):
        # Interpolate each row of the input matrix independently.
        disp = np.zeros((input_matrix.shape[0], self._no))
        for row in range(input_matrix.shape[0]):
            disp[row, :] = self.interp_vector(input_matrix[row, :])
        return disp
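Note how __call__ dispatches on the shape of its argument: a 1-D array is treated as a single point and handled by interp_vector, while a 2-D array is treated as a batch of points and handled by interp_matrix. Continuing from the example above:

# A single point (1-D input) and a batch of points (2-D input).
single = interpolator(np.array([0.3, 0.7]))    # returns an array of shape (1, 2)
batch = interpolator(np.array([[0.3, 0.7],
                               [0.1, 0.9]]))   # returns an array of shape (2, 2)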