Case Study: Enumeration Data

Field data were collected during spring (October) 2006 in compartments of commercial plantations ranging in age from 4 to 9 years. A total of 61 plots were sampled in 19 different stands. Plots were established no closer than 50 m to each other to ensure that the sample data set captured as much variability as possible in the chosen stand. The geographic location of the plot centre was recorded using DGPS. A circular plot was then established with an initial radius of 15 m, which was adjusted based on the slope of the compartment. The diameter at breast height (DBH) was recorded in every plot for each tree with a DBH larger than 5 cm. Tree height was measured using a Vertex hypsometer, with at least 15 DBH-height pairs recorded for each plot. The DBH-height pair data were employed to develop regression equations, which were used to compute tree height for all trees based on the DBH (R² = 0.98, p < 0.001). Several height metrics were subsequently calculated for each plot. The data collected were used for validation purposes only. Data were collected using a stratified sampling scheme applied to LiDAR height data and informed by an empirical semi-variogram analysis, as described below.
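The DBH-height regression step can be illustrated with a minimal sketch. The text does not state the functional form of the regression equations, so the log-linear form height = a + b ln(DBH) used here is an assumption, as are the function names and the example values; it should not be read as the model actually fitted in the study.

```python
import numpy as np

def fit_height_model(dbh_cm, height_m):
    """Fit height = a + b * ln(DBH) by ordinary least squares.

    The log-linear form is an assumed stand-in; the study does not
    report which regression form was used.
    """
    x = np.log(np.asarray(dbh_cm, dtype=float))
    y = np.asarray(height_m, dtype=float)
    b, a = np.polyfit(x, y, 1)  # polyfit returns slope first, then intercept
    predicted = a + b * x
    ss_res = np.sum((y - predicted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r_squared = 1.0 - ss_res / ss_tot
    return a, b, r_squared

def predict_height(dbh_cm, a, b):
    """Predict height for trees where only DBH was measured."""
    return a + b * np.log(np.asarray(dbh_cm, dtype=float))

# Hypothetical DBH-height pairs for one plot (not data from the study).
pairs_dbh = np.array([6.0, 10.0, 15.0, 20.0, 25.0])
pairs_height = 2.0 + 5.0 * np.log(pairs_dbh)

a, b, r_squared = fit_height_model(pairs_dbh, pairs_height)
heights_all_trees = predict_height([8.0, 12.0, 18.0], a, b)
```

In practice one such model would be fitted per plot or per stand from the measured pairs and then applied to every tree with a recorded DBH.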

2.5.1 Case Study: Semi-variogram Modelling and Sampling

The selection of an appropriate LiDAR sample size is critical, since the samples should in theory capture as much canopy variability as possible within each compartment. Richards et al. (2000) provide the following criteria for designing a sampling strategy: the scheme should be sensor independent; unbiased estimators and error variance should be computed; areas not of interest should be excluded; the sample should be selected from areas where change is expected to be high; and adjacent plots should not contain redundant information and should be systematically spaced. The area sampled for this study consisted of 1,000 ha of plantation forests made up of Eucalyptus, Pinus, and Australian Acacia species. The species and age groups were defined prior to setting out the sampling strategy; the target population had thus already been defined as Eucalyptus compartments between the ages of 4 and 9 years. Compartments satisfying these initial criteria were selected using a geographical information system based on information from the forest company. LiDAR canopy returns, normalised by a ground digital elevation model (DEM), were subsequently selected and added to the analysis data set for the defined compartments. Normalisation of LiDAR returns was necessary in order to account for varying ground height, i.e., the process involved subtraction of the associated DEM values from first-return LiDAR point heights. The selection of a suitable sample with which to model tree height was then undertaken using geostatistical methods.
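The normalisation described above amounts to a per-point subtraction. The sketch below assumes that the ground DEM elevation has already been sampled at each first-return location; the function name, the optional clipping of negative heights, and the example values are illustrative assumptions and do not come from the study.

```python
import numpy as np

def normalise_returns(first_return_z, ground_dem_z, clip_negative=False):
    """Convert first-return elevations to heights above ground.

    first_return_z : elevations of first-return LiDAR points (m)
    ground_dem_z   : DEM elevation sampled at each point location (m)
    clip_negative  : optionally set small negative heights to zero
                     (an assumed convenience, not part of the text)
    """
    z = np.asarray(first_return_z, dtype=float)
    ground = np.asarray(ground_dem_z, dtype=float)
    if z.shape != ground.shape:
        raise ValueError("one DEM value is required per LiDAR point")
    height = z - ground
    if clip_negative:
        height = np.clip(height, 0.0, None)
    return height

# Hypothetical elevations in metres above sea level.
canopy_height = normalise_returns([105.0, 110.5, 99.8], [100.0, 100.5, 100.0])
clipped = normalise_returns([105.0, 110.5, 99.8], [100.0, 100.5, 100.0],
                            clip_negative=True)
```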

The criteria stipulated by Richards et al. (2000) indicate that the scheme should be sensor independent. In this case the LiDAR sensor was a standard two-return system, which constitutes the lower limit in terms of LiDAR technology currently in use. Richards et al. (2000) furthermore state that the variance of the sample estimates should be calculated. We calculated sample and population statistics and evaluated these based on various descriptive statistical measures. However, the sampling scheme first needed to be designed and implemented. The geostatistical methods discussed above, which measure and quantify the spatial dependence of LiDAR canopy height returns, were employed for this purpose. It was deemed imperative that the sample from the LiDAR height data set should capture as much as possible of the vertical variability present in the compartment of interest. Semi-variograms applied to LiDAR canopy height returns represent an ideal tool for this purpose (Butson and King 1999).

Semi-variograms were derived by first calculating an empirical semi-variogram. Empirical semi-variograms measure the spatial dependence of neighbouring observations for any continuously varying phenomenon and have the advantage of relating key descriptors of the spatial statistics of the data, namely range and sill (Treitz 2001). Range is especially important in this study, as it contributed to determining the width of sample transects. The range value, as calculated by the empirical semi-variogram, quantifies the distance at which height values are no longer statistically related (Curran and Atkinson 1998). This implies that if a sample of points is selected based on the range distance (randomly selected training points), this sample should capture most of the canopy height variability present in the compartment (Woodcock et al. 1988a, b). We recorded the average range from semi-variograms for each of the 61 field plots. Spherical models were fitted to the empirical semi-variograms and iteratively optimised using the residual sum of squares as the fitting criterion (Hiemstra et al. 2009).
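To make the range and sill concrete, the following sketch computes an empirical semi-variogram and fits a spherical model by minimising the residual sum of squares. It is a simplified stand-in and not the fitting procedure of Hiemstra et al. (2009): the range is chosen by a grid search over hypothetical candidate values, and the nugget and partial sill are solved by unconstrained linear least squares, so negative estimates are possible with noisy data. All function names and example values are assumptions.

```python
import numpy as np

def empirical_semivariogram(xy, z, lag_width, n_lags):
    """Return mean lag distance and semivariance for each non-empty lag bin."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    i, j = np.triu_indices(len(z), k=1)           # every unordered point pair
    dist = np.hypot(xy[i, 0] - xy[j, 0], xy[i, 1] - xy[j, 1])
    sq_diff = (z[i] - z[j]) ** 2
    bins = np.floor(dist / lag_width).astype(int)
    lags, gammas = [], []
    for b in range(n_lags):
        mask = bins == b
        if mask.any():
            lags.append(dist[mask].mean())
            gammas.append(0.5 * sq_diff[mask].mean())  # gamma(h)
    return np.array(lags), np.array(gammas)

def spherical(h, nugget, partial_sill, rng):
    """Spherical model: rises until the range, then stays at the sill."""
    s = np.minimum(np.asarray(h, dtype=float) / rng, 1.0)
    return nugget + partial_sill * (1.5 * s - 0.5 * s ** 3)

def fit_spherical(lags, gammas, candidate_ranges):
    """Grid search over range; least squares for nugget and partial sill."""
    best = None
    for rng in candidate_ranges:
        s = np.minimum(lags / rng, 1.0)
        basis = np.column_stack([np.ones_like(lags), 1.5 * s - 0.5 * s ** 3])
        coef, *_ = np.linalg.lstsq(basis, gammas, rcond=None)
        rss = float(np.sum((basis @ coef - gammas) ** 2))
        if best is None or rss < best[0]:
            best = (rss, float(coef[0]), float(coef[1]), float(rng))
    rss, nugget, partial_sill, rng = best
    return nugget, partial_sill, rng, rss

# Synthetic check: recover known parameters from a noise-free spherical curve.
demo_lags = np.arange(1.0, 31.0)
demo_gammas = spherical(demo_lags, 0.5, 2.0, 12.0)
nugget, partial_sill, fitted_range, rss = fit_spherical(
    demo_lags, demo_gammas, np.arange(2.0, 30.0))
```

The fitted range is the quantity of interest here, since it indicates the separation distance beyond which canopy heights are no longer spatially related.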

The identified optimal LiDAR transect width was subsequently doubled to reflect the operational use of LiDAR transects more accurately and to ensure that the transect width captured the majority of height variation in the study area, with the semi-variogram range serving as an indicator of the minimum width. This resulted in pseudo flight lines of the determined width for the imagery and LiDAR data and a between-transect spacing of 150 m to ensure coverage of the entire study area and all compartments (Richards et al. 2000). The pseudo flight lines were further divided into 10 m² blocks, which were sub-sampled to ensure that samples were systematically spaced in order to minimise the potential for redundancy within the sample data set. Training data for modelling purposes were eventually selected from transects located in 19 Eucalyptus compartments. Maximum LiDAR height values (dependent variable) were extracted, since these are recognised as being representative of actual tree height in closed canopy forests (Popescu et al. 2002). Co-located multispectral data were also extracted, with the mean spectral reflectance (independent variable) used for modelling maximum LiDAR height.
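The block-wise extraction and systematic sub-sampling can be sketched as follows. The block side length, the sub-sampling step, the local coordinate system, and the function names are all hypothetical; the text does not specify how the blocks were laid out or which blocks were retained.

```python
import numpy as np

def block_maxima(x, y, height, block_size):
    """Maximum normalised LiDAR height per square block.

    Returns a dict mapping (row, col) block indices to the maximum height.
    block_size is the assumed block side length in the same units as x and y.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    height = np.asarray(height, dtype=float)
    cols = np.floor(x / block_size).astype(int)
    rows = np.floor(y / block_size).astype(int)
    maxima = {}
    for r, c, h in zip(rows, cols, height):
        key = (int(r), int(c))
        if key not in maxima or h > maxima[key]:
            maxima[key] = float(h)
    return maxima

def systematic_subsample(maxima, step):
    """Keep every step-th block along both axes (assumed spacing rule)."""
    return {key: value for key, value in maxima.items()
            if key[0] % step == 0 and key[1] % step == 0}

# Hypothetical points in a local transect coordinate system (metres).
maxima = block_maxima(x=[1.0, 2.0, 12.0, 25.0],
                      y=[1.0, 3.0, 1.0, 25.0],
                      height=[5.0, 7.0, 6.0, 9.0],
                      block_size=10.0)
training_blocks = systematic_subsample(maxima, step=2)
```

Mean spectral reflectance would be aggregated over the same retained blocks, so that each training record pairs one maximum height with one reflectance value.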