2017 Feb 16;12(2):e0169748.
doi: 10.1371/journal.pone.0169748. eCollection 2017.

SoilGrids250m: Global gridded soil information based on machine learning

Tomislav Hengl et al. PLoS One.

Abstract

This paper describes the technical development and accuracy assessment of the most recent and improved version of the SoilGrids system at 250 m resolution (June 2016 update). SoilGrids provides global predictions for standard numeric soil properties (organic carbon, bulk density, Cation Exchange Capacity (CEC), pH, soil texture fractions and coarse fragments) at seven standard depths (0, 5, 15, 30, 60, 100 and 200 cm), in addition to predictions of depth to bedrock and distribution of soil classes based on the World Reference Base (WRB) and USDA classification systems (ca. 280 raster layers in total). Predictions were based on ca. 150,000 soil profiles used for training and a stack of 158 remote sensing-based soil covariates (primarily derived from MODIS land products, SRTM DEM derivatives, climatic images and global landform and lithology maps), which were used to fit an ensemble of machine learning methods (random forest and gradient boosting and/or multinomial logistic regression) as implemented in the R packages ranger, xgboost, nnet and caret. The results of 10-fold cross-validation show that the ensemble models explain between 56% (coarse fragments) and 83% (pH) of variation, with an overall average of 61%. Improvements in the relative accuracy, considering the amount of variation explained, in comparison to the previous version of SoilGrids at 1 km spatial resolution, range from 60 to 230%. Improvements can be attributed to: (1) the use of machine learning instead of linear regression, (2) considerable investments in preparing finer resolution covariate layers and (3) the insertion of additional soil profiles. Further development of SoilGrids could include refinement of methods to incorporate input uncertainties and derivation of posterior probability distributions (per pixel), and further automation of spatial modeling so that soil maps can be generated for potentially hundreds of soil variables.
Another area of future research is the development of methods for multiscale merging of SoilGrids predictions with local and/or national gridded soil products (e.g. up to 50 m spatial resolution) so that increasingly more accurate, complete and consistent global soil information can be produced. SoilGrids are available under the Open Data Base License.
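
To make the "amount of variation explained" under 10-fold cross-validation more concrete, the following Python sketch pools out-of-fold predictions and scores them as 1 - SSE/SST. It is only an illustration: the function names are hypothetical, the scoring formula is the usual definition and is assumed rather than taken from the abstract, and the ordinary least squares learner is a stand-in for the ensemble fitted with the R packages ranger, xgboost, nnet and caret.

```python
import numpy as np


def variation_explained(observed, predicted):
    """Fraction of variation explained, computed as 1 - SSE / SST."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    sse = np.sum((observed - predicted) ** 2)
    sst = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - sse / sst


def kfold_variation_explained(X, y, fit, predict, k=10, seed=0):
    """Pool out-of-fold predictions from k folds, then score them once."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    out_of_fold = np.empty_like(y)
    for held_out in folds:
        # Train on every point that is not in the held-out fold.
        train = np.setdiff1d(np.arange(len(y)), held_out)
        model = fit(X[train], y[train])
        out_of_fold[held_out] = predict(model, X[held_out])
    return variation_explained(y, out_of_fold)


# Stand-in learner (ordinary least squares), NOT the ensemble used for SoilGrids.
def fit_ols(X, y):
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_ols(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef
```

Any learner exposing a `fit`/`predict` pair of this shape could be passed in place of `fit_ols` and `predict_ols`; scoring the pooled predictions once, rather than averaging per-fold scores, is a design choice of this sketch.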

Conflict of interest statement

Competing Interests: Aleksandar Blagotić is an employee and web developer at GILAB DOO. There are no patents, products in development or marketed products to declare. This does not alter our adherence to all the PLOS ONE policies on sharing data and materials.

Figures

Fig 1. Standard soil depths following the GlobalSoilMap.net specifications and example of numerical integration following the trapezoidal rule.
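
The numerical integration mentioned in the Fig 1 caption can be sketched as follows: predictions at the standard depths are averaged over a depth interval with the trapezoidal rule. This is a minimal Python illustration, not the authors' code; the function name is hypothetical and the pH values are invented for the example.

```python
# Standard depths (cm) listed in the abstract.
STANDARD_DEPTHS_CM = (0, 5, 15, 30, 60, 100, 200)


def depth_interval_mean(values, upper_cm, lower_cm, depths=STANDARD_DEPTHS_CM):
    """Average a soil property over [upper_cm, lower_cm] with the trapezoidal rule.

    `values` holds one prediction per standard depth; both interval bounds
    must themselves be standard depths.
    """
    if len(values) != len(depths):
        raise ValueError("expected one value per standard depth")
    i, j = depths.index(upper_cm), depths.index(lower_cm)
    if j <= i:
        raise ValueError("lower_cm must be deeper than upper_cm")
    area = 0.0
    for k in range(i, j):
        thickness = depths[k + 1] - depths[k]
        # Area of one trapezoid: layer thickness times the mean of its two end values.
        area += thickness * (values[k] + values[k + 1]) / 2.0
    return area / (lower_cm - upper_cm)


# Hypothetical pH predictions at the seven standard depths.
ph_at_depths = [5.0, 5.2, 5.6, 6.0, 6.4, 6.8, 7.0]
ph_0_30 = depth_interval_mean(ph_at_depths, 0, 30)  # (25.5 + 54 + 87) / 30 = 5.55
```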
Fig 2. Example of soil variable-depth curves: Original sampled soil profiles (black rectangles) vs predicted SoilGrids values at seven standard depths (broken red line), and predicted soil organic carbon stock for depth intervals 0–100 and 100–200 cm.
Locations of points from the USDA National Cooperative Soil Survey Soil Characterization database: a mineral soil profile S1991CA055001 (122.37°W, 38.25°N), and an organic soil profile S2012CA067002 (121.62°W, 38.13°N).
Fig 3. Input profile data: World distribution of soil profiles used for model fitting (about 150,000 points shown on the map; see acknowledgments for a complete list of data sets used).
Yellow points indicate pseudo-observations. For the majority of points shown on this map, laboratory data can be accessed from ISRIC’s World Soil Information Service (WoSIS) at http://wfs.isric.org/geoserver/wosis/wfs.
Fig 4. Examples of covariates used to generate SoilGrids: TWI is the Topographic Wetness Index (values multiplied by 100), EVI is the MODIS Enhanced Vegetation Index (values multiplied by 10,000), s.d. LST is the long-term standard deviation of MODIS Land Surface Temperatures (values in degrees Celsius).
Location: San Francisco Bay Area, California. Size of the bounding box is 300 by 300 km.
Fig 5. The (data-driven) statistical framework used for generating SoilGrids.
SoilGrids are primarily based on publicly released soil profile compilations, NASA’s MODIS and SRTM data products and Open Source software compiled with the ATLAS library: R (including contributed packages), and Open Source Geospatial Foundation (OSGeo) supported software tools.
Fig 6. Fitted variable importance plots for target variables.
Generated as an average of predictions using the ranger and xgboost packages (for soil types results are based on the ranger model only). DEPTH.f is depth from soil surface, T**MOD3 and N**MOD3 are mean monthly temperatures daytime and nighttime (red color), TWI, DEM, VBF and VDP are DEM-parameters (bisque color), M**MOD4 are mean monthly MODIS NIR band reflectances (cyan color), P**MRG3 are mean monthly precipitation (blue color), E**MOD5 are mean monthly EVI derivatives (dark green color), VW*MOD1 are monthly MODIS Precipitable Water Vapor images (orange color), C**GLC5 are land cover classes (light green color), and ASSDAC3 is the average soil and sedimentary-deposit thickness (brown color).
Fig 7. Examples of relationships for target variables and the most important covariates: (top row) bulk density in kg m⁻³, (middle row) soil pH, and (bottom row) soil organic carbon in per mille (on log scale).
Plots show target variables and the top three most important covariates as reported by the random forest model. DEPTH.f is the observed depth from soil surface, T09MOD3 is mean monthly temperature for September, TMDMOD3 is mean annual temperature, PRSMRG3 is total annual precipitation, M04MOD4 is mean monthly MODIS NIR band reflectance for April, P07MRG3 is mean monthly precipitation for July, T01MOD3 is mean monthly temperature for January, and T02MOD3 is mean monthly temperature for February.
Fig 8. Correlation (density) plots produced as a result of 10–fold cross-validation.
See also Table 1 for more details.
Fig 9. Maps of scaled Shannon Entropy index (Eq 5) for USDA and WRB soil classification maps.
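
As a rough illustration of the scaled Shannon Entropy index named in the Fig 9 caption, the Python sketch below computes the entropy of per-pixel class probabilities using a logarithm whose base equals the number of classes, so the result falls between 0 (one class certain) and 1 (all classes equally likely). The exact form of Eq 5 is not reproduced on this page, so this definition is an assumption, and the function name is hypothetical.

```python
import math


def scaled_shannon_entropy(probabilities):
    """Shannon entropy with log base equal to the number of classes.

    Assumed form: H = -sum(p_k * log_b(p_k)), with b = number of classes.
    Classes with zero probability contribute nothing to the sum.
    """
    b = len(probabilities)
    if b < 2:
        raise ValueError("need at least two classes")
    if not math.isclose(sum(probabilities), 1.0, abs_tol=1e-6):
        raise ValueError("class probabilities must sum to 1")
    return -sum(p * math.log(p, b) for p in probabilities if p > 0)


# Hypothetical predicted probabilities for four soil classes at one pixel.
uncertain_pixel = scaled_shannon_entropy([0.25, 0.25, 0.25, 0.25])
confident_pixel = scaled_shannon_entropy([0.97, 0.01, 0.01, 0.01])
```

Under this assumed definition, pixels where the classifier spreads probability evenly across classes score near 1, while pixels dominated by a single class score near 0.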
Fig 10. Example of scaled Shannon Entropy index for USDA and WRB soil classification maps, zoomed in on the US state of Illinois near the city of Chicago.
This figure uses the same legend as used in Fig 9.
Fig 11. List of some remote sensing data of relevance for global soil mapping projects (i.e. with near-global coverage and with remote sensing technology of interest to soil mapping).
Landsat 8 is part of the Landsat Data Continuity Mission (LDCM) maintained by NASA and the United States Geological Survey (USGS). ALOS Global Digital Surface Model is a product of the Japan Aerospace Exploration Agency. Sentinel–1,2 is the Earth observation mission developed by the European Space Agency as part of the Copernicus Programme. WorldDEM™ is a commercial product distributed by Airbus Defence and Space.
Fig 12. Comparison between predicted soil pH: (above) SoilGrids (our predictions) for part of California and predictions based on the SSURGO data set (for 0–200 cm depth interval) developed by the National Cooperative Soil Survey, (below) SoilGrids (our predictions) for Tasmania and predictions based on the Soil and Landscape Grid of Australia [76] (for 0–5 cm depth interval).
The correlation coefficients between the two data sources are 0.79 and 0.71, respectively. Crosses on the map indicate soil profiles used for generating SoilGrids.
Fig 13. SoilGrids can be considered the ‘coarsest’ component of the global soil variation ‘signal’ curve.
Other components, e.g. finer products based on local / more detailed 250–100 m resolution imagery, could be added to produce a merged product.
Fig 14. Basic design and functionality of SoilGrids.org: Soil web-mapping browser that provides interactive viewing of 3D soil layers.
Reference administrative data, basic functionality and output data license of SoilGrids.org are primarily based on OpenStreetMap.
