Case Study: Non-linear Artificial Neural Network (ANN)

A non-linear artificial neural network (ANN) was the final statistical approach evaluated. Jordan and Bishop (1996) describe neural networks as a graph with patterns, represented by numerical values attached to the nodes of the graph, and transformations between patterns achieved via message-passing algorithms. The power of an ANN is rooted in the fact that it is designed to replicate a biological neural system. An advantage of this methodology is that an ANN is able to learn and adapt to the underlying structure of the data set being analysed. The ANN approach is also capable of handling non-normality, non-linearity, and collinearity within a data set of interest (Haykin 1994). Typically, an ANN is trained using a sample or training set that consists of both dependent and independent variables. In this study the training data set was extracted from the 225 sample plots and included maximum LiDAR height as the dependent variable and mean IKONOS reflectance and compartment age as the independent variables.

Fig. 2.8 Topological structure of an ANN (Haykin 1994)
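The training set described above can be sketched as a simple plot-level table. The values below are illustrative placeholders, not the study's actual 225 plots:

```python
import numpy as np

# Hypothetical sample plots (made-up values for illustration only):
# columns are mean IKONOS reflectance, compartment age (years),
# and maximum LiDAR height (m).
plots = np.array([
    [0.21,  4.0,  6.5],
    [0.18,  9.0, 14.2],
    [0.15, 15.0, 21.8],
    [0.17, 12.0, 18.0],
    [0.16, 18.0, 24.1],
])

X = plots[:, :2]   # independent variables: reflectance and age
y = plots[:, 2]    # dependent variable: maximum LiDAR height
```

In practice one row would exist for each of the 225 sample plots, with `X` fed to the input layer and `y` used as the training target.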

The structure of an ANN is made up of several layers (Fig. 2.8), most notably an input layer (containing the training data), a number of hidden layers, and an output layer. Each layer is in turn made up of a number of neurons, which represent the fundamental processing units of any ANN. A neuron consists of three parts (Fig. 2.9): a set of synapses or connecting links, an adder or transfer function, and an activation function. For this study we first normalised the spectral reflectance and age (by subtracting the mean and dividing by the standard deviation) before introducing these independent variables to the neurons via the synapses or connecting links.
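The three parts of a neuron can be illustrated in a minimal sketch. The weights, bias, and normalisation constants below are invented for the example (the study's actual network weights were fitted during training), and tanh stands in for whichever activation function was used:

```python
import numpy as np

# One input vector: [mean reflectance, compartment age] (made-up values)
x = np.array([0.18, 12.0])

# Normalise each input: subtract the mean, divide by the standard
# deviation. These means/stds are placeholders for the training set's.
mu = np.array([0.17, 10.0])
sd = np.array([0.03, 4.0])
x_norm = (x - mu) / sd

w = np.array([0.5, -0.8])    # synaptic weights (connecting links)
b = 0.1                      # bias term

v = np.dot(w, x_norm) + b    # adder: weighted sum of the inputs
yhat = np.tanh(v)            # activation function (tanh as a sketch)
```

A full network simply stacks many such neurons into layers, with each layer's activations becoming the next layer's inputs.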

Inputs were standardised (0-1) to facilitate inclusion of the age variable, and weights were initially set using a random seed value. Optimisation of the network was undertaken using bootstrapping methods, with 50% of the training data removed from the training set and used to assess the accuracy of the network weights. Fifty bootstrapped models were calculated for each epoch, with a total of 500 epochs specified. The bootstrapping approach allowed the mean effect of each input variable (with a confidence interval) to be calculated, similar to the standard output of a multiple regression analysis. Goodness-of-fit statistics, similar to those reported for regression models, were used to assess the model. Finally, after model training was completed, the neural network was applied directly to the population data to create spatially explicit maps of LiDAR height.

Fig. 2.9 Structure of a neuron (Haykin 1994)
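The bootstrapping procedure (50% holdout, repeated model fits, a confidence interval per input variable, plus goodness-of-fit statistics) can be sketched as follows. For brevity this sketch fits a linear model with `np.linalg.lstsq` in place of the study's neural network, uses synthetic data, and runs the 50 bootstrap fits for a single epoch; the resampling and summary logic is the point, not the model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the 225 plots (illustrative, not study data).
n = 225
refl = rng.normal(0.17, 0.03, n)          # mean reflectance
age = rng.uniform(3.0, 20.0, n)           # compartment age (years)
height = 1.5 * age - 40.0 * refl + rng.normal(0.0, 1.0, n)
X = np.column_stack([np.ones(n), refl, age])

coefs = []
for _ in range(50):                       # 50 bootstrapped models
    idx = rng.permutation(n)
    train, test = idx[: n // 2], idx[n // 2:]   # 50% held out
    beta, *_ = np.linalg.lstsq(X[train], height[train], rcond=None)
    coefs.append(beta)
coefs = np.array(coefs)

# Mean effect of each input variable with a 95% percentile interval,
# analogous to a multiple-regression coefficient table.
mean_eff = coefs.mean(axis=0)
ci_lo, ci_hi = np.percentile(coefs, [2.5, 97.5], axis=0)

# Goodness of fit on the last held-out half (R^2 as one example).
pred = X[test] @ beta
ss_res = np.sum((height[test] - pred) ** 2)
ss_tot = np.sum((height[test] - height[test].mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
```

Replacing the least-squares fit with one epoch of network training, and repeating the loop for 500 epochs, would mirror the procedure described above; the final fitted model would then be applied to the full population grid to map LiDAR height.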