Introducing the Open Soil Spectral Library

5 December 2021

On this World Soil Day, the Soil Spectroscopy for the Global Good Coordinated Innovation Network is pleased to announce the release of the Open Soil Spectral Library (OSSL), which is publicly available at https://explorer.soilspectroscopy.org.

Screengrab from explorer.soilspectroscopy.org

Soil Spectroscopy for ALL!

The need for high-quality soil data has grown rapidly to support natural resource assessments, sustainable food production, and climate mitigation goals. Soil scientists have been struggling to meet this demand because soil measurement still largely relies on shovels and benchtop analytical methods. Reflectance spectroscopy, the measurement of light absorption at different wavelengths, has emerged as an important rapid and low-cost complement to traditional wet chemical analyses. Numerous research groups and an increasing number of commercial labs are taking advantage of this technology. However, a bottleneck to more widespread adoption of soil spectroscopy is the need to build large reference training datasets and apply complex data analyses. To bridge this gap and enable hundreds of soil and agronomy research groups to get more transparent, more affordable soil data, we have started an Open Source & Open Data project that you can follow at https://github.com/soilspectroscopy.

The OSSL consists of multiple interrelated components. The first is a harmonized database that combines multiple spectral libraries in both the visible-near infrared (VNIR) and mid-infrared (MIR) regions of the electromagnetic spectrum with associated traditionally measured soil properties. These data can be accessed and visualized through the OSSL Explorer, worked with directly via our API, or downloaded as a snapshot of the entire database. In this first release, we are still working out an efficient way to display spectral and analytical data on more than 110,000 samples rapidly and dynamically, so there may be some delays in loading the page and subsetting the database.

Distribution of clay content (wt%) across the six databases that comprise the first release of the OSSL.

The second main component is the OSSL Engine, an estimation service where users can upload spectra collected on their own instruments; a set of soil properties is then estimated for each spectrum using an ensemble of machine learning models (see OSSL Engine). Models have been developed on three sets of training data: only the KSSL MIR data, all MIR data, and all VNIR data. In addition, models have been built that take advantage of spatial information (latitude, longitude and depth) if these data are provided upon upload. While these initial models show promising results, the next year of this project will focus on refining and localizing predictions so the OSSL Engine can consistently return high-quality, unbiased predictions.
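To illustrate what "an ensemble of machine learning models" can mean in practice, here is a minimal sketch that averages the predictions of several models for a single spectrum. Everything in it is an assumption made for illustration: the function name `ensemble_predict`, the toy stand-in models, and the choice of a simple unweighted mean. The OSSL Engine may combine its models differently.

```python
from statistics import mean
from typing import Callable, Sequence

Spectrum = Sequence[float]
Model = Callable[[Spectrum], float]


def ensemble_predict(models: Sequence[Model], spectrum: Spectrum) -> float:
    """Return the unweighted mean of each model's prediction for one spectrum."""
    if not models:
        raise ValueError("at least one model is required")
    return mean(model(spectrum) for model in models)


# Hypothetical stand-ins for fitted models; real models would be trained
# on a spectral library with reference soil measurements.
def model_a(spectrum: Spectrum) -> float:
    return 0.5 * mean(spectrum)


def model_b(spectrum: Spectrum) -> float:
    return 0.5 * mean(spectrum) + 0.25


def model_c(spectrum: Spectrum) -> float:
    return max(spectrum) - min(spectrum)


toy_spectrum = [0.25, 0.5, 0.75]  # made-up reflectance values
prediction = ensemble_predict([model_a, model_b, model_c], toy_spectrum)
print(round(prediction, 4))
```

Averaging is only one way to combine models; weighted averages or a second-stage model trained on the individual predictions are common alternatives.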

Accuracy plot for KSSL MIR prediction of soil organic carbon using 5-fold cross validation with spatial blocking (n = 57,400)
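Cross validation with spatial blocking keeps nearby samples in the same fold, so a model is never tested on points that sit next to its training points. The sketch below shows one simple way to do this, assuming blocks are cells of a regular latitude/longitude grid. The function names, the 1-degree block size, and the sample coordinates are all hypothetical; this is not necessarily the blocking scheme used for the OSSL models.

```python
import math
from collections import defaultdict


def spatial_block_folds(coords, n_folds=5, block_size_deg=1.0):
    """Split sample indices into folds so that all samples falling in the
    same latitude/longitude grid cell end up in the same fold."""
    blocks = defaultdict(list)
    for idx, (lat, lon) in enumerate(coords):
        key = (math.floor(lat / block_size_deg), math.floor(lon / block_size_deg))
        blocks[key].append(idx)

    # Place the largest blocks first, each into the currently smallest fold,
    # to keep fold sizes roughly balanced.
    folds = [[] for _ in range(n_folds)]
    for key in sorted(blocks, key=lambda k: (-len(blocks[k]), k)):
        smallest = min(folds, key=len)
        smallest.extend(blocks[key])
    return folds


def train_test_splits(folds):
    """Yield (train_indices, test_indices), holding out one fold at a time."""
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Made-up (latitude, longitude) pairs; some share a 1-degree grid cell.
coords = [
    (40.1, -75.2), (40.7, -75.9), (-1.3, 36.8), (-1.9, 36.1), (52.0, 5.6),
    (47.4, 8.5), (47.9, 8.1), (10.2, 20.3), (-33.5, 150.2), (35.0, 139.5),
]
folds = spatial_block_folds(coords, n_folds=5)
for train, test in train_test_splits(folds):
    print(len(train), "training samples,", len(test), "held out")
```

Because whole blocks are held out together, the resulting accuracy estimate is usually more conservative than one from a purely random split.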

Open and FAIR science is at the heart of the OSSL. We are extremely grateful to the USDA NRCS National Soil Survey Center – Kellogg Soil Survey Laboratory, ICRAF-World Agroforestry, ISRIC-World Soil Information, the Africa Soil Information Service (AfSIS) funded by the Bill and Melinda Gates Foundation, the European Soil Data Centre, the National Ecological Observatory Network (NEON), and ETH Zurich for publishing and providing high-quality data that the OSSL can build upon. All of our compiled data can be found via https://github.com/soilspectroscopy under the MIT license; a versioned backup copy of the data is also available via Zenodo under a CC-BY license. Both licenses allow you to extend and build upon this data and code, and even to build commercial businesses on top of them.

Important disclaimer: Please use with care. The OSSL is a work in progress. This first release is intended as a sounding board for feedback from users. The dataset will continue to grow, models will continue to improve and we will keep refining the user experience. Please contribute to this project and help us make better tools for measuring and monitoring our soils and land.

Note: The OSSL Engine is functional if you use a dataset that is fully compatible (this small MIR dataset is good for testing the engine). The upload function should allow flexibility during upload. Please be patient as we work out these final details.

A short YouTube video demonstrating how to use the service can be viewed here.

Screengrab from engine.soilspectroscopy.org showing predictions for pH

These products are provided as pre-beta versions, and we would love feedback on how to improve the system as we continue to refine the OSSL. Complete documentation can be found here. Feedback can be submitted as an issue on GitHub or via email to soilspec4gg@woodwellclimate.org.

Contribute

OSSL is a genuine Open Science project with both Open Source and Open Data licenses. Help us make better models for global good, and especially to support large-scale land restoration / regenerative agriculture projects. If you are the owner of soil spectral calibration libraries, please consider DONATING your data to the project. Your data will be systematically imported into OSSL and used to update global and local soil spectroscopy calibration models; your attribution and citation requests will be carefully followed.

If you are not able to share your data under an Open Data license (for whatever reason), please consider sharing it internally with the Soil Spectroscopy for Global Good project or joining one of our working groups. We are open to signing a data non-disclosure agreement that will protect your data and your clients. This way you would still be able to contribute to global good and help us improve global calibration models. Open Reproducible Soil Spectroscopy Calibration models (usually distributed through Docker or open source software in R and/or Python) can then still be distributed under an Open Data license without compromising any privacy or similar concerns you might have about the original point data.

About

Soil Spectroscopy for the Global Good is a Coordinated Innovation Network funded by the USDA NIFA Food and Agriculture Cyberinformatics Tools Program (Award #2020-67021-32467). This project brings together soil scientists, spectroscopists, informaticians, data scientists and software engineers to overcome some of the current bottlenecks preventing wider and more efficient use of soil spectroscopy. For more information: soilspectroscopy.org. This project has many partners, including the FAO Global Soil Partnership-GLOSOLAN, which encourages individual efforts to build soil spectral calibration libraries and estimation services that could serve its global community.


Jonathan Sanderman
