Artificial neural network implementation

For each of the four {ss} time series a specific ANN is trained with an improved Back Propagation (BP) algorithm [9]. In the present work an ANN with a simple optional feedback connection was used (Figure 2, left line). The feedback line transforms the Feed Forward (FF) ANN into a Recurrent Neural Network (RNN). The blank rectangles in Figure 2 symbolize the activation functions, those with z^{-1} a one-day delay, and those with the unity value the unity inputs for the bias weight connections. Each layer of the ANN includes dendrite connections with their weights, denoted by sloped lines. The dendrite summation points of the neurons are denoted by circles and the output activation functions by blank rectangles. A bipolar sigmoidal activation function was applied for the neurons in the hidden layers and a bipolar linear activation function for the output neuron. The input and output signals were normalized to the range (-1…1) using the normalization equations in [9]. With a conventional BP algorithm, the weights [w_u] at iteration step u are updated as a function of the matrix η[{δ_u}{y_u^T}] of eqn. (4), where {δ_u} is the propagated error at the output of an arbitrary layer of the ANN, η is the pre-adjustable learning rate and {y_u} is the output of the previous layer.

[w_{u+1}] = [w_u] + η [{δ_u} {y_u^T}] + α [w_u − w_{u−1}]     (4)

As the present ANN has only one neuron in the output layer, {δ_u} is a scalar (δ_u) and the weight matrices in eqn. (4) are all vectors. In order to improve the convergence speed, online training rather than batch training is used, i.e. the ANN weights are updated for each daily mean [30].
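As an illustration, a minimal sketch of this online update is given below, assuming NumPy arrays and the notation of eqn. (4); the function name, default values and array shapes are illustrative and not taken from [9] or [30].

```python
import numpy as np

def bp_online_update(w, w_prev, delta, y, eta=0.008, alpha=0.8):
    """One online BP step following eqn. (4):
    w_{u+1} = w_u + eta * {delta_u}{y_u^T} + alpha * (w_u - w_{u-1}).

    w, w_prev : layer weight matrix at steps u and u-1, shape (n_out, n_in)
    delta     : back-propagated error at the layer output, shape (n_out, 1)
    y         : output of the previous layer, shape (n_in, 1)
    """
    return w + eta * (delta @ y.T) + alpha * (w - w_prev)
```

With online training this update is applied once per presented daily mean, instead of accumulating the gradient over a whole batch.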


Figure 2 – Circuit of the utilized ANN during its training phase using BP with full dendrite connections between the layers (Note: in the present article additional hidden layers of neurons are used, but for simplification of the scheme the ANN is drawn with only one hidden layer)
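A minimal sketch of the forward pass drawn in Figure 2 follows, assuming tanh as the bipolar sigmoidal activation and a simple bipolar min-max scaling; the exact normalization equations are those of [9], and all names and shapes here are illustrative.

```python
import numpy as np

def normalize(x, x_min, x_max):
    # Bipolar min-max scaling to the range (-1 ... 1); the actual
    # normalization equations of [9] may differ (assumption).
    return 2.0 * (x - x_min) / (x_max - x_min) - 1.0

def forward(x, y_prev, W_hidden, W_out, use_feedback=True):
    """Forward pass of the ANN of Figure 2 (one hidden layer shown).

    x        : normalized daily input vector, shape (n_in,)
    y_prev   : previous day's output fed back through the z^-1 delay
    W_hidden : hidden-layer weights, shape (n_hidden, n_in + 2)
    W_out    : output-layer weights, shape (1, n_hidden + 1)
    """
    # The optional feedback line turns the FF ANN into an RNN.
    feedback = y_prev if use_feedback else 0.0
    # Append the delayed feedback signal and the unity bias input.
    h_in = np.concatenate([x, [feedback, 1.0]])
    # Bipolar sigmoidal activation in the hidden layer (tanh assumed).
    h = np.tanh(W_hidden @ h_in)
    # Bipolar linear (identity) activation in the single output neuron.
    return float(W_out @ np.concatenate([h, [1.0]]))
```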

By the BP the error energy E (eqn. 3) is propagated back through partial differentiations, hence the error gradient δ_u = ∂E/∂s_u is obtained at the ANN output layer [30] (Figure 2). If during the training δ_u = f(u) a local minimum of δ_u is separated from the global minimum by high walls with steep Δδ_u = f(Δu) gradients, the algorithm may need too many steps to climb the walls out of the local minimum and runs the risk of being trapped [30]. Therefore two distinct pre-adjustable values were used for the learning rate η: η(−Δδ_u) = 0.008 for decreasing δ_u residuals and η(+Δδ_u) = 0.013 for increasing δ_u residuals. The former minimizes the uncertainties during learning, while the latter enables the algorithm to climb the walls more quickly when the residuals increase, in order to search for the global minimum. An adjustable momentum factor α = 0.8 additionally increases the weight actualization (eqn. 4), and thus the learning speed, at locations where the learning process is more successful. These locations are identified by the weight modification gradient of the last two learning steps [w_u − w_{u−1}]. For higher gradients the matrix α[w_u − w_{u−1}] accomplishes larger weight modifications and vice versa. The decrease of the weight actualization prevents the algorithm from jumping over a narrow global minimum and therefore increases the stability of the learning process [30].
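A small sketch of the two-rate heuristic combined with the momentum term is given below, assuming the scalar output residual δ_u of eqn. (4) and the criterion that the smaller rate is used while |δ_u| shrinks; the function and parameter names are illustrative.

```python
def adaptive_online_step(w, w_prev, delta_u, delta_prev, y,
                         eta_dec=0.008, eta_inc=0.013, alpha=0.8):
    """One online update of the output-layer weight vector with the
    two learning rates and the momentum term of eqn. (4).

    delta_u, delta_prev : output residuals of the current and previous step
    w, w_prev           : weight vectors at steps u and u-1 (NumPy arrays)
    y                   : hidden-layer output vector (including the bias entry)
    """
    # Smaller rate while the residual decreases (careful descent),
    # larger rate while it increases (climb out of local minima faster).
    eta = eta_dec if abs(delta_u) <= abs(delta_prev) else eta_inc
    # Momentum alpha*(w_u - w_{u-1}): strong recent weight changes are
    # amplified, weak ones damped, which stabilizes learning near a
    # narrow global minimum.
    return w + eta * delta_u * y + alpha * (w - w_prev)
```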