I have just finished writing a new tool for estimating viewsheds from digital elevation models (DEMs) and for calculating a relative visibility index. The algorithms are based on earlier tools that I wrote for the Terrain Analysis System (TAS). The visibility index estimates the size of the viewshed for each grid cell in a DEM (see figure below).
Fig. 1. Example of the relative visibility index.
These algorithms are extremely computationally intensive, as most viewshedding algorithms are, which makes applying them to even moderately sized DEMs quite challenging on modestly powered computers. So I thought that I would take this opportunity to explore using Java’s concurrent features (i.e. parallel processing) to improve performance. This has become an increasingly important means of speeding up computationally intensive spatial analysis algorithms, since most modern computers have multi-core processors. I was able to implement a parallel version of the visibility index tool that provided substantial speed-ups over the serial version.
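For anyone curious what this looks like in code, here is a minimal sketch of one common way to parallelize a per-cell raster computation with java.util.concurrent: each task computes one output row, so no two tasks ever write to the same cells. The class name is my own invention, and the cellVisibility method is a toy placeholder, not a real viewshed calculation, which would trace lines of sight across the DEM.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class VisibilityIndexSketch {

    // Toy stand-in for a per-cell viewshed-size estimate. Here a cell
    // "sees" every cell at or below its own elevation -- NOT a true
    // viewshed, just enough work to demonstrate the threading pattern.
    static double cellVisibility(double[][] dem, int row, int col) {
        double count = 0;
        for (double[] r : dem)
            for (double z : r)
                if (z <= dem[row][col]) count++;
        return count;
    }

    // Row-parallel visibility index: each submitted task fills exactly
    // one row of the output grid, so tasks never contend for a cell.
    static double[][] visibilityIndex(double[][] dem, int numThreads) {
        int rows = dem.length, cols = dem[0].length;
        double[][] output = new double[rows][cols];
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int r = 0; r < rows; r++) {
                final int row = r;
                futures.add(pool.submit(() -> {
                    for (int c = 0; c < cols; c++)
                        output[row][c] = cellVisibility(dem, row, c);
                }));
            }
            // Wait for every row; Future.get() also guarantees the
            // workers' writes are visible to this thread.
            for (Future<?> f : futures) f.get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
        return output;
    }
}
```

Notice that the whole DEM is held in memory and is read by every worker thread, which is exactly the memory issue discussed below.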
In doing so, I have come to realize one of the major challenges that we face in the field of spatial analysis: there is a trade-off between using a concurrent (parallel) approach to a spatial analysis algorithm and the size of the data that can be processed. On the one hand, a parallel approach speeds up processing by putting all of those available processor cores to work. On the other hand, it would appear that parallel algorithms need the geospatial data being manipulated by a tool to be stored in computer memory, unless I’m missing something obvious. Most of Whitebox’s existing tools have been written in a way that allows for processing very large raster files, by reading into memory only chunks of data that fit comfortably within the available resources. If the use of concurrent methods requires data to be housed within RAM, we find ourselves once again limited by memory in the size of the data that we can process. This brings me back to the days of programming TAS in the early 2000s.

Unless the amount of RAM on computers increases substantially, I worry that our ability to develop concurrent spatial analysis algorithms for processing massive raster data sets may be somewhat limited. It seems that all those extra cores on our processors are a bit of a mixed blessing and require considerable re-thinking of some of the standard ways of processing our geospatial data! No doubt this would make a very interesting avenue of research for some bright young geoscientist…
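One partial compromise, at least for tools whose per-cell work is local, is to keep the chunked I/O but parallelize within each chunk. The sketch below (the class name, method names, and the placeholder per-cell operation are all my own inventions for illustration) processes a raster strip-by-strip, so only one strip of rows needs to sit in RAM at a time, while the rows within each strip are handled in parallel. Unfortunately this pattern does not rescue viewshed-type operations, where each cell’s computation needs line-of-sight access to the entire DEM, which is exactly the difficulty described above.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

public class ChunkedParallelSketch {

    // Placeholder for a *local* per-cell operation that only needs the
    // strip currently in memory (e.g. a simple map-algebra operation).
    static double processCell(double[][] strip, int row, int col) {
        return strip[row][col] * 2.0;
    }

    // Process the raster in strips of stripRows rows. Only one strip is
    // held in memory at a time, and the rows within a strip are computed
    // in parallel via a parallel IntStream.
    static double[][] processInStrips(double[][] raster, int stripRows) {
        int rows = raster.length, cols = raster[0].length;
        double[][] output = new double[rows][cols];
        for (int start = 0; start < rows; start += stripRows) {
            int end = Math.min(start + stripRows, rows);
            // In a real tool this strip would be read from disk here,
            // rather than copied from an in-memory array.
            double[][] strip = Arrays.copyOfRange(raster, start, end);
            final int base = start;
            IntStream.range(0, end - start).parallel().forEach(r -> {
                for (int c = 0; c < cols; c++)
                    output[base + r][c] = processCell(strip, r, c);
            });
        }
        return output;
    }
}
```

The design choice here is simply to move the parallelism inside the chunk loop: the serial outer loop preserves the bounded-memory behaviour of the existing Whitebox tools, while the cores are kept busy within each strip.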