Announcing the WhiteboxTools library


Things have been moving forward with Whitebox GAT at a fast pace recently. I have had numerous requests over the years from people who want to use Whitebox GAT tools from outside of the user interface (UI). While all of Whitebox’s tools can be called from scripts, historically, there has been no way to call a Whitebox tool from outside the UI. Quite some time ago, I committed myself to finding a solution for this problem but efforts have been delayed by a coincident urgent development need. Increasingly, the decision to develop Whitebox GAT in Java has been problematic as systems provide ever more marginal support for the Java platform. The solution to these two issues (i.e. external automation of Whitebox tools and the need to decrease reliance on Java) has a convenient synergistic overlap. Today I am publicly announcing the WhiteboxTools library. WhiteboxTools is a sub-project of the Whitebox GAT open-source GIS project. The description below is taken from the WhiteboxTools website:

“WhiteboxTools is an advanced geospatial data analysis engine. The library has been developed using the Rust programming language, a very performant and safe systems language often viewed as a modern replacement for C/C++. Although WhiteboxTools is intended to serve as a source of plugin tools for the Whitebox GAT open-source GIS project, the tools contained in the library are stand-alone and can run outside of the larger Whitebox GAT project. See Usage for further details. There have been a large number of requests to call Whitebox GAT tools and functionality from outside of the Whitebox user-interface (e.g. from Python automation scripts). WhiteboxTools is intended to meet these usage requirements. Eventually most of the approximately 400 tools contained within Whitebox GAT will be ported to WhiteboxTools. In addition to separating the processing capabilities and the user-interface (and thereby reducing the reliance on Java), this migration should significantly improve processing efficiency. This is because Rust is generally faster than the equivalent Java code and because many of the WhiteboxTools functions are designed to process data in parallel wherever possible. In contrast, the older Java codebase included largely single-threaded applications.”

At the time of writing, approximately 170 of Whitebox GAT’s plugins have been added to the WhiteboxTools library and more are added every day. WhiteboxTools is already a powerhouse of geospatial analysis in a small 5 MB package! The development to date has focused on porting Whitebox’s extensive raster and LiDAR analysis toolset; vector support will eventually be added as well. Distributed under the highly permissive MIT license, WhiteboxTools can be embedded into other software projects to provide geospatial analytical capabilities. The library is currently considered experimental; as the project stabilizes, its functionality will be incorporated into the larger Whitebox GAT project itself. Feedback and contributions are always welcome.


Whitebox GAT Usage

It is difficult to get reliable information about the usage of open-source software like Whitebox GAT. With the recent release of Whitebox 3.4 ‘Montreal’, I decided to undertake an analytical exercise to try and figure out where Whitebox GAT is being used based on some download information that I had accumulating in my inbox. This work updates a previous survey that I carried out a number of years back, shortly after the 1.0 release. Examining nearly 21,000 downloads of the software, I discovered the following:

  • Whitebox GAT has been downloaded in at least 178 countries worldwide, with the top 10 countries, in terms of number of downloads, being the United States, Canada, India, Italy, South Africa, the United Kingdom, Germany, Brazil, Australia, and Spain.
  • Whitebox GAT has been downloaded in 5149 cities around the globe. The map below shows each city in which Whitebox GAT was downloaded at least one time over the past four years. The cities with the greatest number of downloads include: Pretoria (South Africa), Mountain View (US), Johannesburg (South Africa), Toronto (Canada), Chisinau (Moldova), Durham (US), Santiago (Chile), London (Canada), Guelph (Canada), New Delhi (India), Beijing (China), Jakarta (Indonesia), and Athens (Greece).



Cities where Whitebox GAT has been downloaded. Each city has been marked as a red dot.

While the Whitebox Project is not as large scale or fully resourced as some of the bigger open-source GIS projects (e.g. QGIS, GRASS GIS), this survey did demonstrate that it is being used quite widely. Most notably, interest in Whitebox GAT is extensive throughout Europe and North America, but there are healthy-sized user communities in many other regions as well. We’re pleased with how well the project is growing and are always interested in hearing how people are applying the software. If you have an interesting Whitebox GAT success story, let us know.

Whitebox GAT 3.3.0 has been released

I am very pleased to announce that the open-source geographical information system (GIS) and remote sensing package Whitebox GAT version 3.3.0 (codename Glasgow) has now been released. This version was somewhat delayed because of the transition from Google Code to a GitHub-based code repository. However, the transition has now been successful and this new version of Whitebox GAT boasts numerous improvements, enhancements, and bug fixes. Please see the release notes for a detailed description of changes.

The Whitebox GAT project began in 2009 and was intended to provide a platform for advanced geospatial data analysis with applications in both environmental research and the geomatics industry more broadly. It was envisioned from the outset as providing an ideal platform for experimenting with novel geospatial analysis methods. Equally important is the project’s goal of providing a tool that can be used for geomatics-based education. Its unique open-access development model provides easy opportunities for the user community to inspect, modify, and build upon the source code of Whitebox’s many powerful spatial analysis tools.


Canadians don’t live as far north as you think

You form a picture of how people from other countries view the citizens of your own nation when you travel abroad. In my job as an academic geographer I have had opportunities to meet people from many countries and have come to realize that people often view Canadians as being true northerners, eking out their lives in the cold arctic tundra. I recall a conversation I once had with a German that I met at a conference I attended in Brazil a number of years ago. The topic of wine came up over the course of the dinner and my colleague refused to believe that I wasn’t joking about the fact that Canada produces wines. After some time I realized that it was pointless trying to convince him otherwise. I’m also reminded of the experience of a Canadian colleague who, when at a conference in Iceland, was offered profound sympathy by an Icelander because it must be ‘so brutally cold in Canada, even this time of year’. It was June.

I can understand why many people might view Canadians as being isolated, tundra-living northerners, wearing fur hats year ’round. After all, when you look at a map of the world, it’s difficult to miss the enormous, and largely arctic territories of Canada, second only in the world to that of Russia. Of course, the misrepresentation of high-latitude landmasses in most common map projections doesn’t help this situation any.


The problem is that when you only know a very little about a country and its culture it’s easy to make assumptions based on what you glean from a world map. But there’s a difference between a nation’s territory and its population. While it’s true that there are Canadians that do live at extreme latitudes within the arctic, the vast majority of Canadians live considerably farther south. Oh, and no Canadian lives in an igloo, although we all can build good snow forts and defend them with our accurate snowball throwing skills. The fact is that most Canadians live farther south than the citizens of many European nations. Many Canadians are more likely to visit higher latitudes by traveling to Europe than by visiting Northern Canada (a sad fact perhaps). The image of the Canadian Northerner is so deeply entrenched that even many Canadians would have a hard time believing otherwise. We have taken on the Northerner persona in our own cultural identity. But is it deserved? I decided to look at this a little deeper using the analytical capabilities of Whitebox GAT.

First, take a look at the figure below, which compares the population distributions of various Northern Hemisphere nations by latitude and ranked by their medians (i.e. the latitude below which 50% of a nation’s population lives).


Notice how far down Canada appears? It’s also evident from this figure that Russia, Canada, China, and the US have greater latitudinal population variation than the other nations represented in the analysis; they’re geographically large nations after all. However, in each case, the population distributions of these largest nations are heavily skewed to their southern territories, and none more so than Canada. If we’re a cold-loving winter nation, you wouldn’t know it from the latitudes we choose to settle.

There are several interesting facts about the Canadian latitudinal population distribution that can be gleaned from this analysis:

  1. 90% of Canadians live south of more than 90% of British citizens (bonus fact: London, Canada is nearly 1000 km south of London, UK).
  2. 65% of Canadians live south of 47°N while 65% of French citizens live below a more northerly 48.9°N (bonus fact: Paris, Canada is nearly 630 km south of Paris, France).
  3. About 70% of Canadians live south of the 49th Parallel, which forms the border between Canada and the US west of the Great Lakes (Lake of the Woods, actually).
  4. More than 30% of Canadians live at latitudes overlapping with Spain.
  5. More than a quarter of Canadians live south of the northernmost 10% of Chinese citizens.
  6. More than 50% of Canadians live farther south than more than 28% of the Northern Hemisphere’s entire population.

This analysis can be further summarized with the following table, which compares nations by the proportion of their populations living south of various latitudes.

Table 1: The percentage of the populations of various northern hemisphere nations living south of 45.50°N, 49.25°N, and 51.25°N.

Country South of 45.50°N (%) South of 49.25°N (%) South of 51.25°N (%)
Iceland 0.0 0.0 0.0
Norway 0.0 0.0 0.0
Sweden 0.0 0.0 0.0
Denmark 0.0 0.0 0.0
Russia 10.7 17.9 21.0
Ireland 0.0 0.0 0.0
UK 0.0 0.0 9.5
Netherlands 0.0 0.0 4.4
Poland 0.0 0.0 43.2
Germany 0.0 18.6 51.3
Belgium 0.0 0.0 95.9
Ukraine 4.3 54.1 96.4
France 25.2 87.7 100.0
Austria 0.0 100.0 100.0
Hungary 0.0 100.0 100.0
Switzerland 0.0 100.0 100.0
Croatia 47.9 100.0 100.0
Canada 57.2 80.0 91.9
Italy 86.3 100.0 100.0
Spain 100.0 100.0 100.0
US 96.6 99.8 99.8
China 94.3 99.8 100.0

It is apparent that the vast majority of Canadians live farther south than the citizens of the Nordic nations to which Canada is frequently compared. Given their (broadly) similar size and northerly geographic positioning, Canada is also frequently compared with the Russian Federation, yet this analysis demonstrates that Russia populates its northern territories far more than Canada does.

Lastly, ranking countries by their median population latitude, we find that Canada is perhaps much further down the list than one might expect, at number 36.

Table 2: Top 50 countries of the Northern Hemisphere ranked by the latitude of their median population (i.e. the latitude below which half their population lives).
Rank Country Median Pop. Latitude (°N)
1 Svalbard and Jan Mayen 78.42
2 Greenland 65.53
3 Iceland 64.18
4 Faroe Islands 62.02
5 Finland 60.99
6 Norway 59.92
7 Estonia 59.38
8 Sweden 59.33
9 Latvia 56.96
10 Denmark 55.66
11 Russia 55.10
12 Lithuania 54.93
13 Belarus 53.91
14 Ireland 53.33
15 United Kingdom 52.47
16 Netherlands 52.09
17 Poland 51.76
18 Germany 51.22
19 Belgium 50.85
20 Czech Republic 49.91
21 Luxembourg 49.61
22 Ukraine 49.22
23 Slovakia 48.72
24 France 48.53
25 Mongolia 48.38
26 Austria 48.18
27 Hungary 47.50
28 Kazakhstan 47.24
29 Switzerland 47.19
30 Liechtenstein 47.17
31 Republic of Moldova 47.06
32 Saint Pierre and Miquelon 46.96
33 Slovenia 46.07
34 Croatia 45.63
35 Romania 45.58
36 Canada 45.51
37 Serbia 44.68
38 Bosnia and Herzegovina 44.20
39 San Marino 43.95
40 Monaco 43.73
41 Italy 42.92
42 Montenegro 42.69
43 Bulgaria 42.68
44 Andorra 42.52
45 Kyrgyzstan 42.51
46 Macedonia 41.92
47 Georgia 41.73
48 Albania 41.33
49 Uzbekistan 41.25
50 Azerbaijan 40.42

Does this mean that Canadians don’t experience harsh winters, as is often portrayed? Well, the fact is Canada is a very large nation and it’s hard to make sweeping generalizations about something as vaguely defined and variable as ‘the Canadian Winter’. Certainly there are places and times when Canadians experience extreme winter conditions (that’s how we develop those world-class snow fort building skills). At the time that I’m writing this (late February 2016), it is -24 degrees Celsius in Yellowknife, NT (62.4°N) with a snow pack of well over a metre, while where I live in Southern Ontario (43.6°N) it is currently +4 degrees Celsius and there isn’t any snow in my front yard at all. Last year at this same time it felt more like Yellowknife in Toronto. Variation is the norm. For many Canadians, the dominant factor affecting our weather is more the extreme continentality (distance to ocean) rather than the latitude. This is the same reason why we usually experience very hot summers, a fact that people outside of Canada frequently don’t realize as well!

A note on methods and limitations:

The data that I used to complete this analysis was taken from the Free World Cities Database provided by MaxMind. The database provides the latitude and longitude coordinates of world cities, along with their countries and populations. The accuracy of population estimates varies by country, but is expected to be high for the nations used in this comparison. The smallest Canadian settlement recorded in the database had 540 people (Mayo, Yukon), which is indicative of the level of detail in the data.

The use of the World Cities data set also implies that rural populations are not accounted for in the analysis. While this is a limitation of the method, the fact that large urban settlements tend to grow within the richest agricultural regions suggests that the population distribution depicted by the data set should correlate well with the overall urban/rural population of nations.

The analysis was carried out using Whitebox Geospatial Analysis Tools. I wrote a short script in Whitebox’s Scripter using the Groovy programming language. This script digested the MaxMind database, calculated the cumulative frequency distribution of each country’s population by latitude, and output a report table for each of the target countries used in the comparison. It was quite a fun way to procrastinate while marking mid-term exams and a good way to show off some of Whitebox’s capabilities for data science.
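
The cumulative frequency calculation described above can be sketched in a few lines. The following is a minimal Python illustration, not the original Groovy script: the function names are hypothetical, and the input is assumed to be a list of (country, latitude, population) records already parsed from the cities database.

```python
from collections import defaultdict


def latitude_cdf(cities):
    """Build each country's cumulative population distribution by latitude.

    `cities` is an iterable of (country, latitude, population) tuples
    (an assumed, simplified record layout). Returns a dict mapping each
    country to a list of (latitude, cumulative percent) pairs, ordered
    from south to north.
    """
    by_country = defaultdict(list)
    for country, latitude, population in cities:
        by_country[country].append((latitude, population))

    result = {}
    for country, rows in by_country.items():
        total = sum(population for _, population in rows)
        if total == 0:
            continue  # no population recorded, nothing to rank
        rows.sort()  # south to north
        running = 0
        cdf = []
        for latitude, population in rows:
            running += population
            cdf.append((latitude, 100.0 * running / total))
        result[country] = cdf
    return result


def percent_south_of(cdf, latitude):
    """Percentage of the population living at or south of `latitude`."""
    percent = 0.0
    for lat, cumulative in cdf:
        if lat > latitude:
            break
        percent = cumulative
    return percent


def median_latitude(cdf):
    """Lowest latitude at or below which at least half the population lives."""
    for lat, cumulative in cdf:
        if cumulative >= 50.0:
            return lat
    return None


# Made-up records for an imaginary country "A", for illustration only.
sample = [("A", 10.0, 100), ("A", 20.0, 300), ("A", 30.0, 600)]
cdf = latitude_cdf(sample)["A"]
```

Because the records are cities rather than individual residents, the distribution is a step function, so a single large city can account for a jump of many percentage points at its latitude.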

Whitebox GAT’s new website and other developments

There are a few exciting announcements related to new developments on the Whitebox GAT front. The first is that due to changes in the Google Code practices (it has become ‘read only’ and is no longer accepting new code commits), the Whitebox project has moved its source code repository to GitHub. I think that this will eventually make for improved source code management, although there may be some initial transition issues that we’ll need to work past. Some of the documentation will have to be updated to reflect this change.

The second announcement, which I am most excited about, is that I have finally found the time to update the Whitebox GAT website. There is a fresh new and more professional look to the site. I hope you enjoy the new webpage and as always, feedback is welcome. What would you like to see added or changed?

Lastly, Whitebox GAT just got a new little brother called GoSpatial. GoSpatial is a command-line interface program for analyzing and manipulating geospatial data. It has been developed using the Go programming language and is compiled to native code. The project is experimental and is intended to provide additional analytical support for the Whitebox Geospatial Analysis Tools open-source GIS software. GoSpatial can, however, be run completely independently of any other software and is run from a single self-contained executable file.

The GoSpatial geospatial analysis command line tool.

Isn’t it cute?

Modelling the spatial pattern of potential impoundment size from DEMs

For much of my research career I’ve been deeply interested in the techniques that are used to model surface and near-surface drainage patterns using DEMs. Much of this work has focused on DEM processing techniques for handling sinks, such as topographic depressions. Removal of depressions, particularly artifact depressions, from DEMs is a necessary step for most flow-path based hydrological analysis and there is a great deal of research dedicated to developing and testing the methods for this type of flow enforcement operation.

For a long time now, however, I’ve been thinking about the opposite problem. Instead of removing depressions from DEMs, how can we add depressions? In fact, artificially damming a site in a DEM is fairly straightforward and simply involves raising the elevations along a dam site. To me, the more interesting problem is how to locate sites in a DEM that, were a dam or dyke of a specified size inserted, would produce an impoundment (i.e. the artificial body of water behind the dam) of a certain size. In other words, can we measure the impoundment size for each grid cell in a DEM based on a specified maximum dam/dyke height and length? It turns out that this is a fairly tricky problem, which is why I’ve been interested in it for a couple of years now. An impoundment size raster would be very useful for all sorts of applications. For example, it could be used for siting potential wetland restoration projects, monitoring wetland drainage at the landscape scale (presumably using LiDAR DEMs), modelling beaver habitat (or that of other related aquatic species), siting tailings facilities, siting hydroelectric dams, and many other applications.

Well I’ve finally figured it out. I present to you Whitebox GAT’s new Impoundment Index tool:

Impoundment Index Tool

I feel like the image above should have been slowly revealed with an increasing-tempo drum-line and trumpeted crescendo in the background but sadly these are the limitations of the media. Perhaps you could go to the top of your screen again and slowly scroll downward while humming.

It only took two years to ruminate over this little problem and, once I figured it out, an afternoon to code. In fairness, I’ve had a lot of other things to keep me distracted in that time, like developing an entire GIS for example. For a specified input DEM, the Impoundment Index tool will calculate the size, either the area or the volume, of the impoundment that would result from inserting a dam of a specified maximum height and length. This is a sample of what the output raster looks like:

Sample impoundment index raster

As you might expect, the impoundment size is highest in sites with locally convergent topography and extensive, relatively low-lying upslope areas. In this case, each grid cell in the raster contains the estimated area in square metres of the impoundment that would result from inserting a dam up to 5 m high and 11 grid cells long.

There are two parts to the algorithm. The first component measures the height of a potential dam for each grid cell based on the topographic profile perpendicular to the flow line at the site. The second component to the algorithm performs a type of flow accumulation operation. Where most accumulation operations of this type propagate a single value (e.g. the area upslope) along flow-paths, this algorithm propagates an entire elevation distribution. In that way, it is possible to measure, for each grid cell, the number of grid cells in its catchment that are less than the previously measured maximum dam crest elevation. The really tricky part is how it manages to propagate the elevation distributions to each cell in a computationally and memory efficient manner such that your computer doesn’t explode!
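
The propagation idea in the second component can be illustrated with a deliberately naive Python sketch. It is not the tool's implementation: it assumes a simple D8 steepest-descent flow direction, substitutes a fixed dam height for the profile-based measurement of the first component, and stores each elevation distribution as a plain sorted list, so it makes no attempt at the memory efficiency described above. All function names are hypothetical.

```python
from bisect import bisect_left
from heapq import merge

# Offsets to the eight neighbours of a grid cell.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]


def d8_receiver(dem, r, c):
    """Return the steepest strictly-downslope neighbour of (r, c), or None."""
    rows, cols = len(dem), len(dem[0])
    best, best_slope = None, 0.0
    for dr, dc in NEIGHBOURS:
        rr, cc = r + dr, c + dc
        if 0 <= rr < rows and 0 <= cc < cols:
            distance = 2 ** 0.5 if dr and dc else 1.0
            slope = (dem[r][c] - dem[rr][cc]) / distance
            if slope > best_slope:
                best, best_slope = (rr, cc), slope
    return best


def impounded_cell_counts(dem, dam_height):
    """Count, per cell, the catchment cells lying below that cell's dam crest.

    Cells are visited from highest to lowest, so every upslope contributor
    has already passed on its elevations before a cell is evaluated.
    """
    rows, cols = len(dem), len(dem[0])
    order = sorted(((dem[r][c], r, c) for r in range(rows)
                    for c in range(cols)), reverse=True)
    # Each cell starts with a distribution holding only its own elevation.
    upslope = {(r, c): [dem[r][c]] for r in range(rows) for c in range(cols)}
    counts = [[0] * cols for _ in range(rows)]

    for z, r, c in order:
        elevations = upslope.pop((r, c))  # sorted ascending, now complete
        crest = z + dam_height            # assumed fixed-height dam
        counts[r][c] = bisect_left(elevations, crest)
        receiver = d8_receiver(dem, r, c)
        if receiver is not None:
            # Propagate the whole distribution, not just a single value.
            upslope[receiver] = list(merge(upslope[receiver], elevations))
    return counts
```

Multiplying each count by the grid cell area would give an impoundment area. The sorted lists make the per-cell query a binary search, but merging them downstream is exactly the part that becomes expensive on large DEMs.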

The Impoundment Index tool can even output a DEM where grid cells with an impoundment size within a specified target range have their dams inserted, allowing you to model the potential damming sites:

Modelled impoundments

The tool is still experimental but will hopefully be officially released in Whitebox GAT v. 3.2.3 or whatever the next update becomes. If you can’t wait to play with the Impoundment Index tool, it’s already in the code repository. So you can easily start messing around with it by selecting ‘Update Scripts From Repository’ under Whitebox’s Tools menu. After updating and restarting Whitebox, you’ll find the new tool within the Wetlands Tools and Flowpath Terrain Attributes toolboxes. I’d like to write a paper describing exactly how the tool works in detail in the hopes that other GIS packages will also implement this useful method. Maybe you’ll enjoy this new and exciting function as much as I’ve enjoyed the process of creating it. As always, best wishes and happy geoprocessing.

Whitebox Geospatial Analysis Tools v. 3.2.2 Released

It is with tremendous pleasure that I announce the release of the latest version of Whitebox Geospatial Analysis Tools, version 3.2.2. It has been several months since the release of the previous version of Whitebox GAT and we have incorporated several significant new features, tools, and bug fixes. The following is a partial list of changes:

  • LAS files can now be rendered based on the Classification, Scan Angle, and GPS Time in addition to Elevation and Intensity. This is on top of several other improvements that have been made to the display of LAS file data in Whitebox GAT.
  • Added Conditional Evaluation tool (if-then-else statement). I really like this tool and now wonder how it was that I managed to get by without it.
  • Added a PickFromList tool, which outputs a raster with the value from a stack
    of rasters based on an input position raster.
  • Added LowestPosition, HighestPosition, PercentEqualTo, PercentGreaterThan, and PercentLessThan tools for working with raster stacks, i.e. lists of overlapping
    raster images.
  • Added a tool to create histograms based on the elevations, intensity, and
    scan angles within a LiDAR (LAS) file. It will also output the percentiles of
    the distribution, e.g. 95th percentile of elevation.
  • Added the AddPointCoordinateToTable tool, which can be used to add the x-y
    coordinates of points within a Point type ShapeFile as fields within its attribute table.
  • Added a tool to filter the points in a LiDAR (LAS) file based on a threshold
    in the absolute scan angle. Currently the output is a shapefile of mass-point shape but eventually we would like to have it write to a new LAS file.
  • The Merge Points Files tool was replaced with the more general Merge Shapefiles
    tool, which works with any ShapeType.
  • Added the FindLowestHighestLocations tool, which will output vector points
    corresponding to the lowest and highest points in a raster. This has already come in handy several times.
  • Added ExtractRasterValuesAtPoints tool for extracting the cell values of each
    image in a raster stack corresponding to a set of vector points. Values are
    placed within the vector attribute table.
  • Added DeleteSmallLakesAndExtendRivers tool, which can be used to remove small lakes (polygons) in a vector drainage network and to extend the associated river
    network (intersecting polylines) into the interior of the lake. I created this tool in response to an interesting question asked over on the GIS Stack Exchange.
  • Added a Long Profile From Point tool, which can generate one or more longitudinal
    profiles for the flowpaths issuing from a set of vector points.
  • Modified the Mosaic With Feathering tool to handle RGB images in addition to
    continuous scale rasters. At the moment, this only works for the nearest-neighbour
    mode. I’m not sure why I didn’t do this earlier.
  • Added Image Stack Profile tool to create line graphs for a set of vector points
    through an image stack, such as multi-spectral satellite image data. This can be handy for visualizing the spectral signatures of individual pixels.
  • Added a Simple Region Grow tool that will perform a very simple region grow
    segmentation of pixels in an image stack based on a specified threshold. I’d like to continue development in this area and eventually include a full object-based image segmentation.
  • Added parallel implementations of the D8 flow pointer and accumulation algorithms. At the moment this is really an experimental tool that is not intended for widespread use but there is more to come, including parallel versions of all the flow accumulation algorithms.
  • Fixed a bug in the Hillslopes tool.
  • I’ve added a whole suite of tools to the Elevation Residuals toolbox for
    performing multi-scaled topographic position analysis. This includes modified
    tools for calculating difference and deviation from mean elevations using an
    integral image approach that is extremely computationally efficient, even with
    large search windows. It also includes the Local Topographic Position Scale
    Signature and the Maximum Elevation Deviation tools. Combined with the
    ‘customRGBScript.groovy’ script, these functions allow for the creation of
    some spectacular multi-scale topographic position visualizations. See this link
    for examples.
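
To show why the integral image approach mentioned in the last item stays fast with large search windows, here is a minimal Python sketch. It is an illustration under stated assumptions rather than the code in the Elevation Residuals toolbox: the function names are hypothetical, the window is a square clipped at the grid edges, and deviation is taken as the difference from the mean divided by the local standard deviation.

```python
def integral_image(grid):
    """Summed-area table with an extra leading row and column of zeros."""
    rows, cols = len(grid), len(grid[0])
    table = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0.0
        for c in range(cols):
            row_sum += grid[r][c]
            table[r + 1][c + 1] = table[r][c + 1] + row_sum
    return table


def window_sum(table, r0, c0, r1, c1):
    """Sum of rows r0..r1 and columns c0..c1 (inclusive) in four lookups."""
    return (table[r1 + 1][c1 + 1] - table[r0][c1 + 1]
            - table[r1 + 1][c0] + table[r0][c0])


def local_position(dem, radius):
    """Return (difference from mean, deviation from mean) grids."""
    rows, cols = len(dem), len(dem[0])
    sums = integral_image(dem)
    squares = integral_image([[z * z for z in row] for row in dem])
    diff = [[0.0] * cols for _ in range(rows)]
    dev = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        r0, r1 = max(0, r - radius), min(rows - 1, r + radius)
        for c in range(cols):
            c0, c1 = max(0, c - radius), min(cols - 1, c + radius)
            n = (r1 - r0 + 1) * (c1 - c0 + 1)
            mean = window_sum(sums, r0, c0, r1, c1) / n
            # Clamp tiny negative variances caused by rounding.
            variance = max(window_sum(squares, r0, c0, r1, c1) / n
                           - mean * mean, 0.0)
            diff[r][c] = dem[r][c] - mean
            dev[r][c] = diff[r][c] / variance ** 0.5 if variance > 0 else 0.0
    return diff, dev
```

Once the two tables are built, each cell needs only eight lookups regardless of the window radius, which is what makes repeating the calculation over many scales practical.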

And like a proud father, I can’t resist showing some nice pictures…

Whitebox GAT 3.2.2 screenshot

Whitebox GAT v. 3.2.2

One point of note is that Oracle stopped supporting Java on Windows XP some time ago and therefore recent versions of Whitebox GAT no longer function on this platform. It’s time to upgrade!

Please let me know if you have any feedback or questions regarding the new version of Whitebox GAT and I hope you enjoy all the new goodies. As always, best wishes and happy geoprocessing.

UPDATE (April 15, 2015): It would seem that there were some issues with some of the plugin tools written as Groovy scripts that resulted from breaking changes associated with the update from Groovy 2.3 to Groovy 2.4.1. To overcome these issues I rolled back the Groovy library linked to by Whitebox GAT to the 2.3 version and now all of the affected tools are working properly. Thank you to the users that alerted me to this issue.

The language of sinks, topographic depressions, and pits…

Over the last several decades, there has been tremendous confusion over the terminology used to describe the features that are involved in flow-enforcement of digital elevation models (DEMs) applied in surface drainage pattern modelling. In particular, the terms sink, depression, and pit have been used interchangeably and this has caused, in my opinion, a great deal of confusion in both the academic literature and practice. In a manuscript that I am currently drafting, I developed a typology of features involved in the flow-enforcement step, a necessary first step for any process involving hydrological analysis of topographic data. The following flow-chart describes the relation between these overlapping concepts in what I think is a consistent manner.

A typology of features involved in flow-enforcement

Multi-scale topographic position visualizations

I thought that people would enjoy this beautiful map that I have been working on this holiday season. It is a visualization derived from an SRTM digital elevation model based on a multi-scale topographic position analysis. This is one that you really have to click to enlarge to fully appreciate. I have spent hours lost in the detailed galactic colouring of this map!

This beautiful map of eastern Canada and the US was made with Whitebox GAT’s newest multi-scale topographic position tools. (click on image to enlarge)

I’m working on a paper right now in which I describe a new form of local topographic position analysis and the map above is one of the resulting visualizations. It shows prominent features at the small (blue channel), medium (green channel), and large (red channel) scales. A prominent feature is one that is either significantly above or below the surrounding landscape at the scale of interest. There’s actually a fair amount of analysis (and coding!) involved but if you’re really interested and can’t wait for me to finish the paper, send me an email. Here is a similar map but covering parts of British Columbia, Canada:

Multi-scale topographic position for British Columbia, Canada (click to enlarge)

And this is the multi-scale topographic position map derived from SRTM data for the Northern Territory of Australia:

Topographic position map of the Northern Territory, Australia (click to enlarge)

Personally, I think that these visualizations are remarkable for their ability to characterize the structure of the surface geology of a region but I’m sure that there are many other interesting applications as well. Interpreting the images takes a bit of experience but the following interpretation key can help:

Interpretation Key

Leave your comments below and, as always, best wishes and happy geoprocessing.

****UPDATE (May 27, 2015)****

This work was eventually written up as a manuscript and has recently been accepted for publication by the journal Geomorphology. The citation is:

Lindsay, J.B., Cockburn, J.M.H., and Russell, H.A.J. In press. ‘An integral image approach to performing multi-scale topographic position analysis’ Geomorphology, 245, DOI: 10.1016/j.geomorph.2015.05.025.

This article can be downloaded for free until July 18, 2015 from the following link:,3sl3TsZi

And the submitted draft is available here.

Parallelization of GIS Operations in Whitebox GAT

I often get feedback from new users of Whitebox Geospatial Analysis Tools about how surprisingly fast it is for whatever task they are doing. That is no accident: whenever I develop a new tool or function for Whitebox, I always take algorithm performance into consideration. A big part of that consideration these days is the potential for parallelizing all or parts of an operation to take advantage of the multi-core processors that have become ubiquitous. There are many geoprocessing tools in Whitebox GAT that take advantage of concurrency, e.g. the Clip tool, the various interpolation tools for LiDAR, and several others.

I try to take a balanced approach with respect to parallelization. Not every operation will necessarily benefit from parallelization (some will actually become slower) and some operations are simply unparallelizable, while many tools will only benefit when a certain bottleneck in the processing workflow is calculated concurrently. Another major consideration when I develop a new tool or function is managing memory requirements. People keep talking about Big Data these days like it’s a new phenomenon, but we in the geomatics community have been dealing with massive datasets for as long as there’s been a community. Often parallelizing an operation is possible but adds significant memory requirements that would make the parallelized version only applicable to smaller datasets that can easily fit in system memory. When this is the case, I’ll sometimes provide an option to the user to choose a parallelized or memory-optimized version. I must always consider the scalability of a tool during design.

There is one common scenario when you can really benefit from parallelizing your workflow. Have you ever found yourself in a situation where you have dozens, or maybe even hundreds, of files, all of which need to have the same operation applied? Perhaps you have hundreds of LiDAR tiles and need to apply a depression removal operation on them all. Maybe you have dozens of satellite images that need to have the same convolution filter applied. It’s a fairly common situation for a GIS tech to find themselves in. Of course, you could always write a quick Python, JavaScript or Groovy script to automate the process and save you the tedium of having to perform each operation manually. But unless you dedicate some time to writing that script just right, it’s likely going to run sequentially. All of those extra processing cores on your machine are just going to idle away. I have a machine with 8 cores and boy does it bother me when they’re not all in use at all times! Well, Whitebox has an answer for this particular situation and it’s called Run Plugin In Parallel.
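
To give a sense of what writing that script "just right" involves, here is a minimal sketch in plain Python of a batch that hands each file to a worker thread instead of looping over the files one at a time. This is not how Whitebox implements anything; process_tile is a hypothetical stand-in for whatever operation you need, and the file names are made up. Keep in mind that in standard CPython, threads only give a real speed-up when the work releases the interpreter lock (for example, when it calls native code or an external program); for pure-Python number crunching you would likely swap in ProcessPoolExecutor instead.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def process_tile(input_path):
    """Hypothetical stand-in for one geoprocessing operation.

    A real version would read the file, do the work, and write the result.
    Here it only derives and returns the output file name.
    """
    base, ext = os.path.splitext(input_path)
    return base + "_out" + ext


def run_batch(input_paths, max_workers=None):
    """Apply process_tile to every path using a pool of worker threads."""
    workers = max_workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the inputs
        return list(pool.map(process_tile, input_paths))


# Made-up input files, one task per file
tiles = ["/Documents/Data/DEM%d.dep" % i for i in range(1, 5)]
outputs = run_batch(tiles)
for path in outputs:
    print(path)
```

The sequential equivalent is just a for loop over tiles, which is why it's so tempting to stop there and leave the other cores idle.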

Run Plugin In Parallel tool

With this tool you can call any of Whitebox’s 400+ plugin tools in a type of parallel batch mode. Each call of the tool will run on its own separate thread. This can significantly reduce processing time, particularly when you have a four (or even eight) core system. Even custom plugin tools that you have developed will be available for targeting with the Run Plugin In Parallel tool. Using the tool is fairly straightforward. First, you have to specify the name of the plugin tool that you are running. Here it is important to remember that it is the proper plugin name and not the descriptive name that you may see displayed in the tools listings. Normally the proper plugin name is the same as, or similar to, the descriptive name but without spaces. If in doubt, you can always look at the source code. The second input parameter is a text file. Each line within the text file provides the parameters that are supplied to the tool for one run. The input parameters are going to be specific to the tool and are the same ones that are used when you call a tool from a script. The help documentation for each tool has a Scripting section that describes the input parameters required to run the tool from a script. The following is an example of a text file that could be used to run a parallelized batch mode of the FD8 Flow Accumulation tool (FlowAccumulationFD8):

/Documents/Data/DEM1.dep, /Documents/Data/FlowAccum1.dep, 1, specific catchment area (sca), true, not specified
/Documents/Data/DEM2.dep, /Documents/Data/FlowAccum2.dep, 1, specific catchment area (sca), true, not specified
/Documents/Data/DEM3.dep, /Documents/Data/FlowAccum3.dep, 1, specific catchment area (sca), true, not specified
/Documents/Data/DEM4.dep, /Documents/Data/FlowAccum4.dep, 1, specific catchment area (sca), true, not specified

Notice that each line should contain entries for each of the input parameters for the tool as specified by the help documentation, which in this case include:

demFile, outputFile, exponent, outputType, logTransformOutput, threshold
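
If you have hundreds of files, you probably don't want to type that text file by hand. Here is a rough Python sketch of one way to generate it. The parameter order and default values are copied from the FD8 example above; the function names and the DEM/FlowAccum file naming pattern are assumptions made purely for illustration, so check the Scripting section of the help for whichever tool you're targeting.

```python
def fd8_parameter_line(dem_file, output_file, exponent=1,
                       output_type="specific catchment area (sca)",
                       log_transform=True, threshold="not specified"):
    """Build one line of the parameter file: one complete run of the tool.

    Field order follows the list above: demFile, outputFile, exponent,
    outputType, logTransformOutput, threshold.
    """
    fields = [
        dem_file,
        output_file,
        str(exponent),
        output_type,
        "true" if log_transform else "false",
        str(threshold),
    ]
    return ", ".join(fields)


def build_parameter_lines(data_dir, count):
    """Build lines for DEM1..DEMcount (a hypothetical naming pattern)."""
    lines = []
    for i in range(1, count + 1):
        dem = "%s/DEM%d.dep" % (data_dir, i)
        out = "%s/FlowAccum%d.dep" % (data_dir, i)
        lines.append(fd8_parameter_line(dem, out))
    return lines


def write_parameter_file(path, lines):
    """Write the lines to a text file, one run per line."""
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")


param_lines = build_parameter_lines("/Documents/Data", 4)
for line in param_lines:
    print(line)
```

Calling write_parameter_file with a path of your choosing and param_lines then saves the text file that you point the Run Plugin In Parallel tool at.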

All of the parameters required for one run must be contained on the same line. The four-line example file can be used to run the FD8 Flow Accumulation tool four times, each run on a separate thread, with four different inputs and four outputs. On a typical quad-core system, that can cut the processing time to roughly a quarter of what running the four jobs one after another would take. But the real benefit occurs when you have many more files that need processing. If you have a more complex workflow involving more than one operation, you can run this parallel batch mode for each step (i.e. process all of the depression filling operations and then process all of the flow accumulation operations). You can even call the Run Plugin In Parallel tool from a script to automate these types of operations. If you’re working at that level, I think you’ve earned the right to call yourself a GIS ninja! Leave your comments below and, as always, best wishes and happy geoprocessing.