Abstract:
Soil phosphorus content is an important indicator of soil nutrient status. Few studies have inverted soil total phosphorus content from thermal infrared emissivity data, and most of them build their models with conventional linear regression. In this paper, TASI (Thermal Airborne Hyperspectral Imager) data collected in the Hailun region of northeast China were used to explore, with machine learning, the relationship between soil emissivity and total phosphorus content in black soil; the best-performing model was then selected to predict soil total phosphorus content. The results show that, in the range of 8 to 11.5 μm, the thermal infrared emissivity of soil increases with total phosphorus content. Except for the original spectrum at 10.792 μm, the correlation coefficients between total phosphorus content and emissivity or its mathematical transformations are all below 0.5, indicating a weak correlation. On the training and testing sets of the DNN (Deep Neural Network), RP performs best in terms of model accuracy and the optimization time of the PSO (Particle Swarm Optimization) algorithm, with coefficients of determination R² of 0.51 and 0.7 and root mean square errors (RMSE) of 0.0443 and 0.0301, respectively. Further experiments show that changing the activation function has very limited influence on network accuracy: with the Tansig or ReLU activation function, the training-set accuracy of the DNN is essentially the same as that of the partial least squares and stepwise regression models, while its test-set accuracy is higher but less stable. Soil total phosphorus content is high, exceeding 0.8 g kg⁻¹ in both paddy fields and dry fields, and it can be concluded that it is negatively correlated with the density of towns. Compared with the partial least squares model, the neural network model assigns more pixels to the two content intervals < 0.6 g kg⁻¹ and 0.6-0.8 g kg⁻¹; more pixels near the active region fall into these ranges, which makes the prediction more consistent with the real distribution. In general, the highest total phosphorus content is concentrated in the west, northwest and southwest of the study area, whereas it is scattered and irregularly distributed in the east and north. In the central and other regions the content is mostly 0.8-1.0 g kg⁻¹. In summary, a deep neural network model with appropriate parameters has more potential than methods such as partial least squares and stepwise regression for inverting soil element content; given sufficient sample data, it can be fully trained to make the prediction results more accurate and stable.
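
For reference, the two accuracy metrics quoted above are usually defined as follows. This is a sketch that assumes the standard definitions, since the abstract does not state the formulas used. The symbols are notation introduced here, not taken from the paper: y_i is the measured total phosphorus content of sample i, ŷ_i is the model prediction, ȳ is the mean of the measured values, and n is the number of samples in the set being evaluated.

```latex
% Assumed standard definitions; notation (y_i, \hat{y}_i, \bar{y}, n) is introduced here.
\begin{align}
R^2 &= 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2}, \\
\mathrm{RMSE} &= \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}.
\end{align}
```

Under these definitions RMSE is expressed in the same units as the measured content, and R² would differ slightly if it were instead computed as the squared correlation between measured and predicted values.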