Author(s): Joseph Mascaro *, Gregory P. Asner, David E. Knapp, Ty Kennedy-Bowdoin, Roberta E. Martin, Christopher Anderson, Mark Higgins, K. Dana Chadwick
Machine learning algorithms are increasingly being applied in image analysis problems ranging from face recognition to self-driving vehicles. Recently, the Random Forest algorithm has been used in global tropical forest carbon mapping. However, there is considerable resistance to the use of machine learning algorithms in ecological applications, as the discipline has been the purview of traditional parametric statistics for decades. The cause for concern is genuine: Random Forest has not often been applied to spatial mapping, and there has been limited evaluation of its performance in such applications relative to more traditional alternatives. Here we present a side-by-side comparison of Random Forest-based carbon mapping predictions with those from the reliable and often-used approach of stratification-based sampling.
The problem of tropical forest carbon mapping continues to challenge ecologists and remote sensing experts. In practice, measuring the amount of carbon stored in a patch of forest is straightforward, if logistically challenging. Plant biomass may be harvested, dried and weighed, and from this material the carbon fraction determined. However, such destructive sampling cannot be scaled up to produce spatially explicit carbon stock estimates over large areas. Traditional field campaigns instead rely on national forest inventory networks (grids of field plots within which tree diameters, heights and wood densities are measured) and on allometric models that relate such measurements to estimated carbon stock per tree. But while such networks may be sufficient for estimating total carbon stock in a habitat type, ecoregion or jurisdiction, they are inadequate for estimating spatially explicit carbon stocks. Even immediately adjacent to a particular field plot, an investigator or landowner can predict carbon stock far less reliably than they can predict regional totals. Yet such spatially explicit carbon estimates are essential for many ecological applications as well as for carbon emissions programs such as the United Nations' Reducing Emissions from Deforestation and Forest Degradation (REDD+) effort.
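The plot-based workflow described above (tree measurements, then an allometric model, then a per-area carbon stock) can be sketched as follows. This is a minimal illustration, not the procedure used in this study: the power-law form and its default coefficients follow one widely used pantropical allometry and are assumptions here, as are the carbon fraction of 0.48 and the function names.

```python
def tree_agb_kg(diameter_cm, height_m, wood_density_g_cm3, a=0.0673, b=0.976):
    """Aboveground biomass (kg) of one tree from a power-law allometry.

    AGB = a * (rho * D^2 * H)^b, with D in cm, H in m, rho in g/cm^3.
    The defaults for a and b are illustrative assumptions; a real
    analysis should use the coefficients of its chosen published model.
    """
    return a * (wood_density_g_cm3 * diameter_cm ** 2 * height_m) ** b


def plot_carbon_mg_per_ha(trees, plot_area_ha, carbon_fraction=0.48):
    """Carbon density (Mg C per hectare) for one field plot.

    trees: iterable of (diameter_cm, height_m, wood_density_g_cm3) tuples.
    carbon_fraction: assumed share of dry biomass that is carbon.
    """
    total_agb_kg = sum(tree_agb_kg(d, h, rho) for d, h, rho in trees)
    # Convert kg of biomass to Mg of carbon, then normalize by plot area.
    return total_agb_kg * carbon_fraction / 1000.0 / plot_area_ha


# Hypothetical measurements for a 0.1 ha plot.
example_trees = [(30.0, 25.0, 0.60), (45.0, 32.0, 0.55), (12.0, 10.0, 0.70)]
print(plot_carbon_mg_per_ha(example_trees, plot_area_ha=0.1))
```

Note that the result is a single number per plot. It says nothing about the forest between plots, which is the gap that spatially explicit mapping is meant to fill.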
Remote sensing technologies, particularly LiDAR (Light Detection and Ranging), have been used to estimate spatial variation in carbon stocks. Whether from airborne or spaceborne platforms, laser scanning technologies can measure aspects of forest structure that are similar to those measured in field plots. For instance, tree height is often determined more accurately with LiDAR than from the ground via traditional techniques such as clinometer trigonometry, particularly in dense, tall-statured tropical forests. Still, while LiDAR measurements offer a possible spatial mapping tool for carbon estimates, they too reach a geographic limit due to cost and logistical considerations. Aircraft cannot yet cover all the world's tropical forests, and spacecraft are limited to a net-like sampling scheme. Thus, additional data from satellite inputs, such as Landsat, the Shuttle Radar Topography Mission (SRTM), the Tropical Rainfall Measuring Mission (TRMM), the Moderate Resolution Imaging Spectroradiometer (MODIS) and other sources, are used to scale up LiDAR-based carbon estimates ....
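For readers unfamiliar with the ground-based height technique mentioned above, clinometer trigonometry can be sketched as below. This is a generic textbook formulation under stated assumptions (a known horizontal distance to the stem and sighting angles measured from the observer's eye level), not a protocol taken from this study; the function name and arguments are hypothetical.

```python
import math


def clinometer_height_m(horizontal_distance_m, angle_top_deg, angle_base_deg=0.0):
    """Tree height (m) from two clinometer sightings.

    Angles are measured from the horizontal at eye level: positive above,
    negative below. height = d * (tan(top) - tan(base)).
    """
    top = horizontal_distance_m * math.tan(math.radians(angle_top_deg))
    base = horizontal_distance_m * math.tan(math.radians(angle_base_deg))
    return top - base


# Sensitivity check: how much a 2-degree sighting error changes the estimate
# for a tall tree viewed from 20 m away (hypothetical values).
estimate = clinometer_height_m(20.0, 60.0, -4.0)
perturbed = clinometer_height_m(20.0, 62.0, -4.0)
print(estimate, perturbed - estimate)
```

Because the tangent grows steeply at high angles, small sighting errors on tall trees viewed from nearby translate into large height errors. A clear view of the crown top is also hard to obtain in a dense canopy, which is consistent with the point that LiDAR often measures height more accurately in such forests.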