Dear soil spectroscopy community,
After several months of testing and development work, we are happy to share our latest update of the Open Soil Spectral Library (OSSL) and the additional features built around it.
We have updated the OSSL to version 1.2, which now includes four new spectral libraries: i) the Scion Research MIR library, shared by Loretta Garrett, which contains a few hundred MIR spectra of New Zealand forest soils; ii) the University of Zurich permafrost MIR library, shared by Marcus Schiedung, which contains more than two hundred MIR spectra of permafrost soils from Canada; iii) the Serbian soil spectral library from the University of Novi Sad, shared by Branislav Jović, which contains more than one hundred MIR spectra of Serbian soils from different regions and soil types; and iv) the NeoSpectra Handheld NIR library, which contains more than two thousand near-infrared (NIR) spectra of soils from the US, Ghana, Kenya, and Nigeria, measured using the NeoSpectra Handheld NIR Analyzer developed by Si-Ware. We are grateful to our current OSSL contributors and hope that more people will get involved in our initiative and share their soil spectral libraries.
We have also received important feedback since the release of the first models and web services (Dec. 2021). Users reported that the previous model versions worked well for some of them but less well for others. This variable performance may have arisen because specific soil types were not well represented in the calibration data, or because spectra were not well aligned owing to instrument dissimilarities. In response, the outputs now include a flag that indicates whether each new sample to be predicted is represented by the calibration set. Moreover, we revised our uncertainty estimation by switching to conformal prediction, a simple and robust method for delivering uncertainty bands.
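For readers new to conformal prediction, the idea can be sketched in a few lines. The Python snippet below only illustrates the split-conformal variant with absolute residuals as the score; it is not the OSSL implementation, and the function names, the held-out calibration arrays, and the value of alpha are assumptions made for the example.

```python
import numpy as np

def split_conformal_halfwidth(y_cal, pred_cal, alpha=0.1):
    """Half-width of a (1 - alpha) band from held-out calibration residuals."""
    # Nonconformity score: absolute error on samples not used to fit the model.
    scores = np.abs(np.asarray(y_cal, dtype=float) - np.asarray(pred_cal, dtype=float))
    n = scores.size
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration samples to guarantee this coverage level.
        return float("inf")
    # k-th smallest score (1-indexed).
    return float(np.sort(scores)[k - 1])

def conformal_band(pred_new, halfwidth):
    """Symmetric band around new point predictions."""
    pred_new = np.asarray(pred_new, dtype=float)
    return pred_new - halfwidth, pred_new + halfwidth

# Hypothetical usage with made-up numbers (not OSSL data).
y_cal = [0.0] * 9
pred_cal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
halfwidth = split_conformal_halfwidth(y_cal, pred_cal, alpha=0.5)
lower, upper = conformal_band([10.0, 20.0], halfwidth)
```

The band has the same width for every new sample in this simple form; locally adaptive variants exist, but they are beyond the scope of this sketch.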
In addition, we conducted a systematic analysis of learning algorithms, compression strategies, and preprocessing using the OSSL database and external test sets, drawing on insights from the ring trial experiment, a separate project designed to understand the dissimilarity across multiple soil spectroscopy laboratories. Compared to the first models, we also removed spatial covariates and no longer fuse spectral regions (i.e. VisNIR+MIR), in order to make choices simpler and more generally applicable. Although recent literature supports these combinations, verifying their potential performance improvements requires more time. The current models are calibrated solely on the separate spectral regions (VisNIR, NIR, and MIR) using the Cubist algorithm, which stood out in the ring trial experiment and in other internal benchmark experiments.
Lastly, the OSSL manual has been updated and now includes further information about the OSSL database compilation, access options (files stored on storage buckets, APIs, MongoDB, etc.), modeling framework, and web applications. We believe that these features enhance the usability and trustworthiness of our database and models and make them more suitable for different applications and scenarios. We are currently working on a paper that describes the details and results of our development process and demonstrates the benefits of an open-access and open-source initiative.
We encourage you to test the newly updated prediction engine with your own MIR, VisNIR, and NIR (NeoSpectra Scanner) data sets. If you use the service, please let us know what you think about the user experience and the model performance.
Onwards,
the Soil Spectroscopy for the Global Good team
soilspec4gg@woodwellclimate.org