As a fast, reliable and low-cost technique applicable to large sample sets, near infrared (NIR) spectroscopy has been widely applied for high-throughput phenotyping in forest breeding programmes. The aim of this study was to develop multivariate models for estimating the chemical and physical properties of juvenile wood based on NIR signatures of milled wood. Moreover, two approaches, namely, external validation by clone and by age, were tested to validate the model for estimating extractive content. NIR spectra of wood specimens taken from three clones of
Notably, Brazil is one of the largest producers and consumers of charcoal in the world (
In addition to growth characteristics, resistance to pests and water stress, as well as several intrinsic characteristics of wood must be considered to select genetic materials with potential to increase productivity and wood quality. According to
The conventional methodology for characterizing wood biomass is destructive: trees must be felled to obtain wood specimens, which then undergo time-consuming laboratory procedures to determine their physical and chemical properties. Moreover, this method is generally very costly because wood pieces (discs, stems) must be transported from the forest to the laboratory. Forest breeders demand fast solutions for classifying wood quality in large sample sets, preferably from standing trees, to select the most promising materials and assign them to breeding programmes.
Near infrared (NIR) spectroscopy has been applied to overcome this challenge. The technique has proven to be an efficient tool in breeding programmes for material selection involving a large number of trees (
Most studies involving NIR and wood characterization have been carried out using spectra recorded from solid wood. Few studies have evaluated the performance of models developed from milled wood, even though milled wood can be collected from standing trees with a handheld drill, without felling them. Moreover, most predictive models depend on the age of the trees. Here, we developed multivariate models based on representative NIR spectra recorded from wood powder of an entire tree to predict the chemical and physical properties of wood. We also tested two approaches, namely, external validation by clone and by age, to validate the model for estimating extractive content. The objective is to validate robust and reliable predictive models for estimating the chemical and physical properties of wood based on NIR signatures taken from milled wood samples, regardless of the clone or the age of the tree.
Three
Seven discs were selected from each tree corresponding to 0%, 2%, 10%, 30%, 50%, 70% and 100% of the commercial height of the stem. The commercial height was defined up to a minimum diameter of 4.0 cm with bark. Two opposite wedges (knot free) were removed from discs to determine the basic density, whereas the remaining wedges were milled for chemical analysis of the wood and for recording NIR signatures.
For all chemical analyses, composite wood samples were used, that is, one sample per tree containing material from all longitudinal sampling positions. For elemental analyses, oven-dry sawdust was sieved through stacked 200- and 270-mesh screens, and the fraction retained on the latter was used. For structural chemical analyses, the fraction retained between the 40- and 60-mesh sieves was used.
NIR spectra were recorded using a Fourier transform spectrometer (MPA, Bruker Optik GmbH, Ettlingen, Germany) in diffuse reflection mode in conjunction with the software program OPUS v. 7.5. Spectral signatures were measured between 12500 and 3600 cm⁻¹ at a resolution of 8 cm⁻¹ using an integrating sphere, but only the 9000 to 4000 cm⁻¹ range was used for analysis. A gold standard was measured as the reference, with 16 scans, before spectra were collected from the wood powders.
The discs of each tree were ground and mixed, so that each wood powder sample represented an entire tree. Only the fraction retained between the 40- and 60-mesh sieves was used for recording NIR spectra. The powder was stored in an appropriate vial, and 16 scans were taken for each spectrum. The mean of two NIR spectra per tree/powder sample was calculated and used for regression analysis. Spectra were recorded in an acclimatized room at a temperature of approximately 20 °C and a relative humidity of approximately 65%. Under these conditions, the equilibrium moisture content of the wood powders was approximately 12%.
The spectral data were correlated with the basic density and chemical properties of wood by partial least squares regression (PLS-R) using the software Chemoface v. 1.63 (
Predictive models for basic density, total extractives, ash, holocellulose, S/G ratio and elemental components (nitrogen and carbon) of the wood were developed from NIR spectra measured in the 105 samples. PLS-R models were developed using both untreated and treated spectra. The following mathematical treatments were applied to the NIR spectra: mean centring, normalization, first derivative (13-point filter and a second-order polynomial), second derivative (25-point filter and a second-order polynomial), multiplicative scatter correction (MSC) and standard normal variate (SNV). Derivatives were calculated with the Savitzky-Golay algorithm.
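Three of the treatments listed above can be illustrated with a minimal sketch. It is not the Chemoface implementation; it assumes the sample standard deviation for SNV, the mean spectrum of the set as the MSC reference, and a unit step between spectral points (the `step` parameter is a hypothetical name). For a symmetric window and a second-order polynomial, the Savitzky-Golay first-derivative weights reduce to j / Σj², which is what the last function uses.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    m, s = mean(spectrum), stdev(spectrum)  # assumption: sample SD (n - 1)
    return [(a - m) / s for a in spectrum]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum of the set."""
    n_points = len(spectra[0])
    ref = [mean(s[i] for s in spectra) for i in range(n_points)]
    ref_mean = mean(ref)
    sxx = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = mean(s)
        # Least-squares fit: s is approximately offset + slope * ref
        slope = sum((r - ref_mean) * (a - s_mean) for r, a in zip(ref, s)) / sxx
        offset = s_mean - slope * ref_mean
        corrected.append([(a - offset) / slope for a in s])
    return corrected


def savgol_first_derivative(spectrum, window=13, step=1.0):
    """Savitzky-Golay first derivative, second-order polynomial, symmetric window.

    The points within half a window of each edge are dropped.
    """
    half = window // 2
    norm = sum(j * j for j in range(-half, half + 1)) * step
    return [
        sum(j * spectrum[i + j] for j in range(-half, half + 1)) / norm
        for i in range(half, len(spectrum) - half)
    ]
```

Because the filter fits a second-order polynomial, it returns the exact derivative of any quadratic signal, which is a convenient sanity check before applying it to real spectra.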
PLS-R models were validated by cross-validation and by independent test set validation. The leave-one-out method was used for full cross-validation, while independent set validation was performed using 2/3 of the samples, chosen at random, for calibration and the remaining 1/3 for test set validation.
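The two validation schemes can be sketched as index generators, independently of the regression method. This is only an illustration of the partitioning: the function names and the random seed are hypothetical, and the original random assignment of samples is not known.

```python
import random


def leave_one_out(n_samples):
    """Yield (calibration_indices, held_out_index) for full cross-validation."""
    for i in range(n_samples):
        yield [j for j in range(n_samples) if j != i], i


def random_split(n_samples, calibration_fraction=2 / 3, seed=0):
    """Randomly assign samples to a calibration set and a test set."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary choice
    n_cal = round(n_samples * calibration_fraction)
    return sorted(indices[:n_cal]), sorted(indices[n_cal:])


# With the 105 samples of this study, a 2/3 to 1/3 split gives 70 and 35.
cal, test = random_split(105)
print(len(cal), len(test))  # 70 35
```

In leave-one-out cross-validation, each of the 105 samples is predicted once by a model calibrated on the other 104, whereas the test set samples are never seen by the calibration.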
The statistical parameters used to select the best predictive models were the coefficient of determination of cross-validation (R2cv) and of independent validation (R2p); the root mean square error of cross-validation (RMSEcv) and of independent validation (RMSEp); the ratio of performance to deviation for cross-validation (RPDcv) and for independent validation (RPDp); and the number of latent variables used in the calibration (LV). The RPD is the ratio between the standard deviation of the reference values and the RMSE. Because the RPD is dimensionless, it allows the accuracy of calibrations to be compared even between different wood traits.
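The RMSE and RPD defined above can be computed as in the sketch below. The use of the sample standard deviation is an assumption, and the helper `r2_from_rpd` is hypothetical: it relies on the further assumption that R² is computed as 1 − SSE/SST, in which case R² ≈ 1 − 1/RPD².

```python
from math import sqrt
from statistics import mean, stdev


def rmse(reference, predicted):
    """Root mean square error between laboratory and NIR-predicted values."""
    return sqrt(mean((r - p) ** 2 for r, p in zip(reference, predicted)))


def rpd(reference, predicted):
    """Ratio of performance to deviation: SD of the reference values over RMSE."""
    return stdev(reference) / rmse(reference, predicted)


def r2_from_rpd(rpd_value):
    """Approximate R2 implied by an RPD, assuming R2 = 1 - SSE/SST."""
    return 1 - 1 / rpd_value ** 2
```

Under these assumptions, the RPD of 5.08 reported for the ash model implies an R² of about 0.961, which agrees with the reported R2cv.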
Noise and a lack of relevant information can be observed throughout the range up to 9000 cm⁻¹. Most regions with useful spectral information occur before 7000 cm⁻¹. These regions are attributed to the polymer variations associated with the chemical constitution of wood. According to
The interpretation of the NIR spectrum of wood is complicated because wood is a complex material, mainly composed of cellulose, hemicelluloses, and lignin, along with minor amounts of extractives and inorganics (
Although it was not possible to accurately discriminate clones (
The model developed for the prediction of ash content showed the highest coefficient of determination in cross-validation (R2cv = 0.961) and the highest ratio of performance to deviation (RPD = 5.08). The weakest statistics were obtained for nitrogen content (R2cv = 0.498 and RPD = 1.41). The higher the RPD is, the more robust the model is (
Several studies have reported that wood density can be predicted from NIR signatures (
This kind of predictive model presents many industrial applications since wood density offers considerable information about wood. Wood density is related to several key properties, which highlights that it is a parameter of wood quality applicable in several situations of industrial activity (
This is especially important from an operational point of view: the model can estimate values that can be used to rank unknown materials and classify them in terms of quality from spectra collected from their wood powder, without the need to fell the trees and transport them to the laboratory.
Concerning the chemical composition of wood, NIR spectroscopy detects variations in the chemical constitution and can be used to assess key properties (
When the objective is to produce charcoal from wood, for instance, chemical composition is an aspect affecting industrial performance. In general, wood with a high lignin content and a high percentage of some extractives is more suitable for energy use (
The model for estimating ash content from first derivative NIR spectra yielded the highest determination coefficient (R2 = 0.961) in cross-validation (Model #3 -
With regard to the predictive models for holocellulose, the previous investigations have shown promising findings. For instance,
Cross-validations for nitrogen content showed the lowest R2cv (0.498), an RMSE of 0.067 and an RPD of 1.41, similar to that of holocellulose (
The model for estimating carbon content (
In summary, the predictive models presented in
These models were rebuilt to test them in independent validations. In other words, the database was randomly divided into two sets: one to calibrate the models (2/3 of the samples) and the other (1/3 of the samples) to validate them independently.
In the present study, the PLS-R model for estimating extractive content was selected to be validated by different approaches because the reference data and NIR spectra yielded promising cross-validation results (Model #15 -
Most of the predictive models presented adequate statistics. All PLS-R calibrations presented R2c values greater than 0.90. PLS-R models validated using clone 1 (Model #16) and clone 3 (Model #18) presented good statistics, while Model #17 (
The statistics of Models #16 to #18 (
The samples in this study represent trees from three clones felled at ages ranging from 1 to 6 years. Thus, the same clone was evaluated at each age from 1 to 6 years.
In this second approach, NIR models were developed from wood samples of all ages but one and then validated using the wood samples of the remaining age.
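Both external validation approaches, by clone and by age, amount to holding out one group at a time. A minimal sketch of that partitioning is given below; the function name and the toy labels are hypothetical, and it reflects one reading of the scheme in which each model is calibrated on all groups except the one used for validation.

```python
def leave_one_group_out(groups):
    """Yield (held_out_label, calibration_indices, validation_indices)."""
    for label in sorted(set(groups)):
        cal = [i for i, g in enumerate(groups) if g != label]
        val = [i for i, g in enumerate(groups) if g == label]
        yield label, cal, val


# Toy age labels (years) for six samples; the real data set has 105 samples.
ages = [1, 1, 2, 2, 3, 3]
for age, cal, val in leave_one_group_out(ages):
    print(f"validate on age {age}: calibration={cal}, validation={val}")
```

Passing clone labels instead of ages gives the three splits of the validation by clone, and passing the six ages gives the six splits of the validation by age.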
The results obtained in this study show that the sample selection approach is of fundamental importance, as it makes it possible to drastically reduce the number of analyses carried out in the laboratory for the development of NIR calibrations, without loss of precision and covering all the variability found in the dataset to be evaluated.
According to Models #19 to #24 (
PLS-R models were validated by an independent set of wood specimens and presented promising statistics for estimating wood density (R²p = 0.768), extractives (R²p = 0.912), ash (R²p = 0.936) and carbon content (R²p = 0.697) from NIR signatures measured in milled wood of young trees from 1 to 6 years old.
Then, NIR models for estimating the extractive content of wood were developed based on clones or ages and validated using the clones or ages left out. Most of the predictive models presented adequate statistics (R2 greater than 0.90) and could be applied to routine laboratory analyses or to select potential trees in
Our objective in this study was to develop robust and reliable regressions to estimate wood properties based on NIR spectra taken regardless of the clone or the age of the tree. The results showed that it is important to select representative wood samples for developing NIR models. NIR models developed with representative wood samples are able to satisfactorily estimate the extractive contents of unknown wood samples even when tree samples of some ages were not included in the calibration set.
These models are able to quickly and reliably generate estimates of key wood properties to rank unknown materials and classify them in terms of wood quality without the need to fell and transport the tree. This approach can be combined with a motor-driven coring system that extracts wood samples from standing young trees in the field, as required for forest breeding programmes.
BAL: field collection, data measurement and wood chemical analyses. TGA: review, editing and discussion of results. FMGR: NIRS analyses, evaluation of results and paper writing. PFT: funding and resources, methodology and supervision. PRGH: funding, methodology, data analyses and supervision.
The authors thank the Wood Science and Technology Graduate Program (PPGCTM, UFLA, Brazil) for all the support for this study. This study was financed in part by the
Original NIR spectra of wood samples of
Two-dimensional scatter plots for PC1 and PC2 from principal component analyses (PCAs) of the first derivative NIR spectra grouping samples by Clone (A) and Age (B). Differences in clones or ages are highlighted by the colour scale.
Relationship between wood properties determined in the laboratory and estimated from cross-validation models based on NIR signatures.
Relationship between extractive contents determined in the laboratory and estimated from cross-validation (Model #15) and validations by clone (Models #16-18).
Relationship between extractive contents determined in the laboratory and estimated from PLS-R validations by age (Models #19-24).
Identification of the genetic material used.
Clone | Age (yrs) | City | Location |
---|---|---|---|
1 | 1, 4, 5, 6 | Curvelo | 18°42′ S 44°33′ W |
1 | 2, 3 | Felixlândia | 18°46′ S 44°53′ W |
2 | 1 to 6 | Curvelo | 18°42′ S 44°33′ W |
3 | 1, 2 | Felixlândia | 18°46′ S 44°53′ W |
3 | 3 to 6 | Curvelo | 18°42′ S 44°33′ W |
Analysis of wood properties. (H): hydrogen content; (N): nitrogen content; (S): sulfur content; (O): oxygen content.
Analysis | Procedure |
---|---|
Basic density (BD) | NBR 11941 ( |
Nitrogen (N) | ASTM E870-82 ( |
Carbon (C) | C = 100 - H - N - S - O - ASH |
Extractives content (EXT) | TAPPI T280 pm-99 ( |
Syringyl/guaiacyl ratio (S/G) | |
Ash content (ASH) | NBR 13999 ( |
Holocellulose content (HOLO) | HOLO = 100 - total lignin - EXT - ASH |
Statistics associated with PLS-R calibrations, cross-validations and test set validations for estimating the wood properties of
Method | Statistic | BD | EXT | ASH | HOLO | S/G | N | C |
---|---|---|---|---|---|---|---|---|
Cross-validation | LV | 8 | 10 | 8 | 7 | 8 | 6 | 10 |
Cross-validation | R²c | 0.864 | 0.936 | 0.979 | 0.666 | 0.718 | 0.6 | 0.843 |
Cross-validation | RMSEC | 0.017 | 0.458 | 0.049 | 0.991 | 0.193 | 0.059 | 0.339 |
Cross-validation | R²cv | 0.754 | 0.907 | 0.961 | 0.503 | 0.527 | 0.498 | 0.761 |
Cross-validation | RMSEcv | 0.023 | 0.553 | 0.067 | 1.222 | 0.253 | 0.067 | 0.421 |
Cross-validation | RPDcv | 2.01 | 3.3 | 5.08 | 1.41 | 1.45 | 1.41 | 2.04 |
Test set validation | LV | 10 | 10 | 10 | 5 | 8 | 6 | 10 |
Test set validation | R²c | 0.862 | 0.929 | 0.966 | 0.564 | 0.605 | 0.607 | 0.833 |
Test set validation | RMSEC | 0.017 | 0.461 | 0.061 | 1.131 | 0.224 | 0.059 | 0.354 |
Test set validation | R²p | 0.768 | 0.912 | 0.936 | 0.239 | 0.525 | 0.509 | 0.697 |
Test set validation | RMSEp | 0.023 | 0.586 | 0.104 | 1.69 | 0.284 | 0.068 | 0.492 |
Test set validation | RPDp | 2.04 | 3.4 | 3.44 | 1.03 | 1.35 | 1.41 | 1.72 |
Statistics associated with PLS-R test set validations by clone for estimating total extractive contents in
Set | Statistic | Model #15 | Model #16 | Model #17 | Model #18 |
---|---|---|---|---|---|
Calibration | LV | 5 | 8 | 7 | 8 |
Calibration | Min | 1.49 | 1.49 | 1.49 | 3.55 |
Calibration | Mean | 5.40 | 5.34 | 5.55 | 5.35 |
Calibration | Max | 9.26 | 8.87 | 9.26 | 9.26 |
Calibration | R²c | 0.872 | 0.949 | 0.978 | 0.912 |
Calibration | RMSEC | 0.541 | 0.348 | 0.267 | 0.347 |
Calibration | RPDc | 2.81 | 3.39 | 6.79 | 4.51 |
Validation | Min | 1.49 | 3.61 | 3.55 | 1.49 |
Validation | Mean | 5.40 | 5.56 | 5.21 | 5.53 |
Validation | Max | 9.26 | 9.26 | 7.43 | 8.87 |
Validation | R²p | 0.816 | 0.882 | 0.228 | 0.915 |
Validation | RMSEp | 0.667 | 0.481 | 0.907 | 0.624 |
Validation | RPDp | 2.28 | 2.93 | 1.09 | 3.48 |
Statistics associated with PLS-R test set validations by age for estimating total extractive contents in
Set | Statistic | Model #19 | Model #20 | Model #21 | Model #22 | Model #23 | Model #24 |
---|---|---|---|---|---|---|---|
Calibration | LV | 4 | 2 | 2 | 6 | 7 | 7 |
Calibration | Min | 3.36 | 1.49 | 1.49 | 1.49 | 1.49 | 1.49 |
Calibration | Mean | 5.68 | 5.63 | 5.51 | 5.32 | 5.36 | 5.10 |
Calibration | Max | 9.26 | 9.26 | 9.26 | 9.26 | 9.26 | 7.57 |
Calibration | R²c | 0.822 | 0.707 | 0.696 | 0.898 | 0.928 | 0.919 |
Calibration | RMSEC | 0.567 | 0.820 | 0.870 | 0.502 | 0.419 | 0.368 |
Calibration | RPDc | 2.39 | 1.86 | 1.83 | 3.16 | 3.79 | 3.56 |
Validation | Min | 1.49 | 3.36 | 3.55 | 4.25 | 4.08 | 5.34 |
Validation | Mean | 3.72 | 4.11 | 4.85 | 5.99 | 5.74 | 7.30 |
Validation | Max | 5.37 | 5.01 | 6.39 | 7.57 | 7.47 | 9.26 |
Validation | R²p | 0.884 | 0.907 | 0.917 | 0.836 | 0.712 | 0.759 |
Validation | RMSEp | 0.807 | 0.483 | 0.2969 | 0.443 | 0.575 | 1.259 |
Validation | RPDp | 1.72 | 1.04 | 2.82 | 2.00 | 1.80 | 1.07 |