Modeling of the spatial distribution of tree species based on survey data has recently been applied to conservation planning. Numerous methods have been developed for building species habitat suitability models. The aim of this study was to investigate the suitability of Pleiades satellite data for modeling tree species diversity of Hyrcanian forests in northern Iran (Mazandaran Province). One hundred sample plots were established over an area of 2600 ha and surveyed for tree diversity, and Simpson’s index (D), Shannon’s index (H′) and the reciprocal of Simpson’s index (1/D) were calculated for each plot. Spectral variables and several parameters derived by texture analysis were obtained from multispectral images of the study area and used as predictors of tree diversity of sample plots. Two methods, generalized additive models (GAMs) and multivariate adaptive regression splines (MARS), were used for modeling. The developed models gave a fairly good prediction of plot tree diversity (adj-R^{2} = 0.542-0.731). Shannon’s H′ and Simpson’s 1/D indices were more accurately predicted using GAM-based methods, while MARS models were more suitable for predicting Simpson’s D. We concluded that Pleiades satellite data can be conveniently used for estimating, assessing and monitoring tree species diversity in the mixed hardwood Hyrcanian forest of northern Iran.

The exponential increase of the world population in the last decades has brought about an unparalleled human exploitation of natural resources worldwide, leading to a global reduction of the naturalness of many environments. This may result in the reduction of biodiversity as well as environmental functions and ecological processes which act to generate and maintain soils, convert solar energy into plant tissue, regulate climatic parameters, and provide multiple forest products (

Accurate and practical methods for estimating biodiversity are needed to develop effective strategies for the conservation and management of forests (

The aims of this study were to: (i) investigate the relationships between field-based tree diversity and the spectral and textural features of remotely sensed data (multispectral images of the Pleiades satellite); and (ii) compare two non-parametric statistical techniques (GAM and MARS) for modeling tree species diversity.

The study area is located within the Hyrcanian forests, District 1 of Darabkola’s forests, Sari, northern Iran (lat. 36° 28′ - 36° 33′ N, long. 53° 16′ - 53° 20′ E -

As species richness and diversity indices depend on the size of the sample plot, phytosociological data were collected based on a systematic sampling method during the period 5 June to 15 July 2010. The size and the number of quadrats were determined based on the species area curve (

The Simpson’s diversity index (D), the Shannon’s diversity index (H′), and the reciprocal of Simpson’s diversity index (1/D) were calculated for each sampling plot based on the proportion of tree species recorded during the field survey. Other indices commonly used for describing the forest structural diversity or the dissimilarity of species across the landscape (
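As a minimal sketch of how such indices can be computed from per-plot species proportions, the function below returns the Shannon index together with three common Simpson variants. The text does not state which Simpson convention was used for D, so all three are returned rather than one being labeled as the study's definition; the function name, the dictionary keys and the example stem counts are hypothetical.

```python
import numpy as np

def diversity_indices(counts):
    """Return Simpson and Shannon diversity measures for one plot.

    counts: abundance (e.g. number of stems) of each tree species in the plot.
    """
    counts = np.asarray(counts, dtype=float)
    counts = counts[counts > 0]      # species absent from the plot carry no weight
    p = counts / counts.sum()        # proportion of each species
    sum_p2 = np.sum(p ** 2)          # chance that two trees drawn with replacement match
    return {
        "simpson_dominance": sum_p2,          # sum(p_i^2)
        "gini_simpson": 1.0 - sum_p2,         # 1 - sum(p_i^2)
        "simpson_reciprocal": 1.0 / sum_p2,   # 1 / sum(p_i^2)
        "shannon": -np.sum(p * np.log(p)),    # H' using the natural logarithm
    }

# Hypothetical stem counts for four species in one plot
plot = diversity_indices([12, 7, 3, 1])
```

The natural logarithm is assumed for H′; other bases only rescale the index.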

Multispectral images of the Pleiades satellite (Airbus Defence and Space, Munich, Germany - http://www.intelligence-airbusds.com/pleiades/) were acquired in April 2013. All images had a 16-bit radiometric resolution. Geometric correction and orthorectification were applied to images before their use. The geometrical correction of the images was optimized by comparing the image data with vector layers of the roads in the studied area.

Pleiades satellite images have four spectral bands (Blue, Red, Green and NIR) with a spatial resolution of 2 m and a panchromatic (PAN) band with a resolution of 0.5 m. In this study, we considered the aforementioned four spectral bands, the PAN band, the Normalized Difference Vegetation Index (NDVI), and the derived texture features of these bands as predictors in the model analysis (
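The NDVI predictor follows the standard definition (NIR − Red) / (NIR + Red). A minimal sketch is given below, assuming the two bands are available as equally shaped arrays; the function name and the choice to return NaN where the denominator is zero are illustrative assumptions.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and Red band arrays."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    denom = nir + red
    out = np.full(denom.shape, np.nan)   # NaN where both bands are zero
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out

# Hypothetical 1x2 pixel example: one vegetated pixel, one empty pixel
example = ndvi([[0.5, 0.0]], [[0.1, 0.0]])
```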

Texture analysis is one of the most suitable processing methods to estimate the characteristics of the forest structure from remote-sensed data (

All the above derived parameters were calculated over the whole study area at the original pixel resolution.
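To illustrate the kind of texture parameters involved (mean, variance, contrast, dissimilarity, entropy), the sketch below computes grey-level co-occurrence matrix (GLCM) statistics for a single moving window using NumPy only. It is an assumption that GLCM-type measures are the relevant ones; the horizontal distance-1 offset, the 16 grey levels and the per-window min-max quantization are illustrative choices, not the settings of the original processing.

```python
import numpy as np

def glcm_features(window, levels=16):
    """GLCM texture statistics for one window (horizontal neighbour, distance 1)."""
    w = np.asarray(window, dtype=float)
    lo, hi = w.min(), w.max()
    if hi == lo:
        q = np.zeros(w.shape, dtype=int)          # flat window: single grey level
    else:
        q = ((w - lo) / (hi - lo) * levels).astype(int)
        q = np.minimum(q, levels - 1)             # keep the maximum inside the range
    glcm = np.zeros((levels, levels))
    # Count co-occurrences of each pixel with its right-hand neighbour
    np.add.at(glcm, (q[:, :-1].ravel(), q[:, 1:].ravel()), 1)
    glcm = glcm + glcm.T                          # make the matrix symmetric
    p = glcm / glcm.sum()                         # joint probabilities
    i, j = np.indices(p.shape)
    mean = np.sum(i * p)
    nz = p[p > 0]
    return {
        "mean": mean,
        "variance": np.sum(p * (i - mean) ** 2),
        "contrast": np.sum(p * (i - j) ** 2),
        "dissimilarity": np.sum(p * np.abs(i - j)),
        "entropy": -np.sum(nz * np.log(nz)),
    }

# Hypothetical 9x9 window of band values
rng = np.random.default_rng(0)
features = glcm_features(rng.random((9, 9)))
```

Sliding this function over every pixel of a band yields one texture raster per statistic; the window size is a free parameter of that procedure.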

All the images (for both the original variables and the derived parameters) were aggregated at a resolution of 60 m (consistently with the size of the field plots - 60×60 m) and their pixel values averaged. Average values for each variable were then extracted from the location of each field plot.
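The aggregation step can be sketched as a block average. With 2 m multispectral pixels, a 60 m cell corresponds to 30×30 pixels (120×120 for the 0.5 m PAN band). The function name is hypothetical, and trimming incomplete blocks at the raster edge is an assumption of this sketch.

```python
import numpy as np

def block_mean(raster, factor):
    """Average non-overlapping factor x factor blocks of a 2-D raster."""
    raster = np.asarray(raster, dtype=float)
    rows, cols = raster.shape
    rows -= rows % factor            # drop incomplete blocks at the edges
    cols -= cols % factor
    trimmed = raster[:rows, :cols]
    blocks = trimmed.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))

# Hypothetical 2 m band of 300 x 300 pixels aggregated to 60 m cells
band = np.arange(300 * 300, dtype=float).reshape(300, 300)
coarse = block_mean(band, factor=30)   # shape (10, 10)
```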

Generalized additive models (GAMs) are semi-parametric regression models (

where f_{j} are smoothed functions estimated from input data, with E[f_{j}(X_{j})] = 0 (
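For reference, the textbook form of a generalized additive model, to which the smooth functions and the zero-mean constraint refer, can be written as follows. The symbols g (link function), \beta_{0} (intercept), X_{j} (predictors) and p (number of predictors) follow standard notation and are assumptions here, not a transcription of the original equation.

```latex
g\bigl(E[Y]\bigr) = \beta_{0} + \sum_{j=1}^{p} f_{j}(X_{j}),
\qquad E\bigl[f_{j}(X_{j})\bigr] = 0 \quad (j = 1, \dots, p)
```

The zero-mean constraint makes the intercept identifiable, since a constant could otherwise be moved freely between \beta_{0} and any f_{j}.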

The multivariate adaptive regression spline (MARS) method was first introduced by

The available data were randomly split into two subsets: 70% of the data for model fitting and 30% for validation. For each tested model, several statistics were recorded, including the coefficient of determination (R^{2}) and the adjusted coefficient of determination (adjusted R^{2}). The latter was used to estimate the expected shrinkage in R^{2} due to over-fitting and the inclusion of too many independent variables in the regression model. Thus, when the adjusted R^{2} value is much lower than the R^{2} value, the regression may be over-fitted to the sample and therefore generalize poorly.
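A minimal sketch of the random 70/30 split and of the two goodness-of-fit statistics is given below. The function names and the fixed random seed are hypothetical, and the adjusted R^{2} uses the usual linear-regression formula with the number of predictors; software for GAM or MARS may use effective degrees of freedom instead.

```python
import numpy as np

def train_validation_split(n_plots, train_fraction=0.7, seed=0):
    """Return index arrays for a random training/validation split."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_plots)
    n_train = int(round(train_fraction * n_plots))
    return order[:n_train], order[n_train:]

def r_squared(y_obs, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_obs - y_pred) ** 2)
    ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n_obs, n_predictors):
    """Penalize R^2 for the number of predictors in the model."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

train_idx, val_idx = train_validation_split(100)   # 70 and 30 plot indices
```

Because the penalty grows with the number of predictors, a large gap between R^{2} and adjusted R^{2} signals a model with many terms relative to the sample size.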

Model performances were assessed on the validation subset using several regression diagnostics metrics such as the root mean square error (RMSE), relative RMSE, bias and relative bias, calculated as follows (

where ŷ_{i} is the value predicted by the model at the i-th pixel, y_{i} is the observed value at the same pixel and
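The validation metrics named above can be sketched as follows. Since the original formulas are not reproduced here, two conventions are assumed: bias is taken as the mean of predicted minus observed values, and the relative metrics are expressed as a percentage of the mean observed value. The function name and dictionary keys are hypothetical.

```python
import numpy as np

def validation_metrics(y_obs, y_pred):
    """RMSE, relative RMSE, bias and relative bias on a validation set."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_obs                    # assumed sign: predicted - observed
    rmse = np.sqrt(np.mean(err ** 2))
    bias = np.mean(err)
    mean_obs = y_obs.mean()                 # assumed normalizer for relative metrics
    return {
        "rmse": rmse,
        "rmse_pct": 100.0 * rmse / mean_obs,
        "bias": bias,
        "bias_pct": 100.0 * bias / mean_obs,
    }

# Hypothetical observed and predicted index values for three plots
metrics = validation_metrics([1.2, 1.5, 1.8], [1.1, 1.6, 1.7])
```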

A high tree species diversity was observed in the study area, as inferred from the three diversity indices obtained from the field survey. The main descriptive statistics of the Simpson’s diversity index (D), the Shannon’s diversity index (H’), and the reciprocal of the Simpson’s diversity index (1/D) in the two datasets (training and validation subsets) are reported in

Regarding the window size used for texture analysis, the highest correlation between the texture-derived parameters and all tree diversity indices was found with a window of 9×9 pixels. This window size was therefore used to derive the texture features of the spectral variables entered in the models.

All the models were critically investigated for confounding factors and checked for all basic assumptions. The number of predictor variables entering the models ranged from two to eight, differing among both the models and the diversity indices considered. For example, regarding the index D, the best predictors were: NIR (mean and contrast), Red, and NDVI (variance) using the GAM models; and NIR (mean and entropy) and NDVI using the MARS model (

The performance statistics for each model are summarized in the table of performance indices. The explained variance (adjusted R^{2} values) ranged from 54.2% (MARS, H′) to 73.1% (MARS, D), indicating a fairly good predictive ability of the models.

The best model was identified as the one with the highest R^{2}, the highest adjusted R^{2} and the lowest RMSE, RMSE_{r}, Bias and Bias_{r} values. In most cases, the best agreement between predicted and observed tree diversity index values at the field plots was obtained by the GAM, which had the lowest RMSE and Bias and the highest adjusted R^{2}. For Simpson’s index D, however, the MARS model gave the best fit.

Hyrcanian forests of northern Iran comprise a highly diverse vegetation cover and are increasingly degraded and converted to other land uses. Understanding the main factors that influence the spatial distribution of both local species richness and spatial species turnover is important to adequately map tree diversity. In this study, we assessed the utility of Pleiades satellite image data and two regression techniques for modeling tree diversity in a Hyrcanian Forest. These results are similar to those obtained by other studies aimed at identifying broad patterns of tree species diversity by satellite data (

All the statistical models applied in this study provided fairly successful predictions of forest tree diversity based on remote-sensed data. In particular, GAM and MARS modeling regressions were successfully applied to identify those parts of the study area where tree species richness is above the average. In a comparable study,

The results of this study are not directly comparable with other relevant research, in particular regarding the use of variables derived by texture analysis as predictors. Furthermore, most published studies used satellite imagery with different spectral and/or spatial resolutions, or were conducted in different forest conditions.

Our results showed that Pleiades satellite data and non-parametric regression models can be conveniently used by resource managers to obtain useful indications on the distribution of tree diversity over large areas in northern Iran, as well as to assess and monitor the status of tree diversity in Hyrcanian forests.

A strong limitation faced by conservation biologists and managers of natural resources is the lack of information on species distribution patterns. To this end, precise biodiversity maps produced by accurate modeling could support the selection of protected natural areas and improve their effectiveness.

Location of the study area in the Mazandaran Province, northern Iran (left panel) and distribution of the sample plots in the study area (right panel).

Overview of the predictor variables selected by the tree biodiversity models developed in this study.

Tree diversity index | Modeling technique | Variables selected by the model |
---|---|---|
Simpson’s D | GAM | Mean NIR, Mean Red, Variance NDVI, Contrast NIR |
Simpson’s D | MARS | Entropy NIR, NDVI, NIR, Mean NIR |
Shannon’s H′ | GAM | Mean Green, Mean Red, Variance NIR, Contrast NIR |
Shannon’s H′ | MARS | Mean NIR, NIR, Dissimilarity Red |
Simpson’s 1/D | GAM | Mean Red, Variance Green, Variance NIR, Mean Red |
Simpson’s 1/D | MARS | Mean NIR, Contrast Red, Entropy NIR |

Descriptive statistics of the tree diversity indices in the training and validation datasets. (N): number of sample plots; (SD): standard deviation.

Tree diversity index | N (training) | Mean (training) | Min (training) | Max (training) | SD (training) | N (validation) | Mean (validation) | Min (validation) | Max (validation) | SD (validation) |
---|---|---|---|---|---|---|---|---|---|---|
Simpson’s D | 70 | 0.47 | 0.11 | 0.76 | 0.17 | 30 | 0.57 | 0.13 | 0.76 | 0.12 |
Shannon’s H′ | 70 | 1.28 | 0.12 | 2.56 | 0.47 | 30 | 1.49 | 0.14 | 2.17 | 0.41 |
Simpson’s 1/D | 70 | 2.11 | 1.10 | 4.01 | 0.64 | 30 | 2.50 | 1.24 | 3.94 | 0.63 |

Performance indices of all models for the three tree diversity indices and the two modeling techniques. (*): best model performance for every evaluation measure.

Tree diversity index | Modeling technique | R^{2} | R^{2}_{adj} | RMSE | RMSE_{%} | Bias | Bias_{%} |
---|---|---|---|---|---|---|---|
Simpson’s D | GAM | 0.623 | 0.617 | 0.92 | 20.7 | 0.01 | 2.6 |
Simpson’s D | MARS* | 0.743 | 0.731 | 0.08 | 18.6 | -0.01 | -2.6 |
Shannon’s H′ | GAM* | 0.624 | 0.621 | 0.37 | 29.8 | -0.17 | -13.7 |
Shannon’s H′ | MARS | 0.563 | 0.542 | 0.5 | 40.32 | -0.22 | -18.2 |
Simpson’s 1/D | GAM* | 0.653 | 0.646 | 0.30 | 24.19 | -0.14 | -11.2 |
Simpson’s 1/D | MARS | 0.615 | 0.59 | 0.52 | 41.9 | -0.23 | -18.5 |