Whitebox Geospatial Analysis Tools v. 3.2.2 Released

It is with tremendous pleasure that I announce the release of the latest version of Whitebox Geospatial Analysis Tools, version 3.2.2. It has been several months since the release of the previous version of Whitebox GAT and we have incorporated several significant new features, tools, and bug fixes. The following is a partial list of changes:

  • LAS files can now be rendered based on the Classification, Scan Angle, and GPS Time in addition to Elevation and Intensity. This is on top of several other improvements that have been made to the display of LAS file data in Whitebox GAT.
  • Added Conditional Evaluation tool (if-then-else statement). I really like this tool and now wonder how it was that I managed to get by without it.
  • Added a PickFromList tool, which outputs a raster with the value from a stack
    of rasters based on an input position raster.
  • Added LowestPosition, HighestPosition, PercentEqualTo, PercentGreaterThan, and PercentLessThan tools for working with raster stacks, i.e. lists of overlapping
    raster images.
  • Added a tool to create histograms based on the elevations, intensity, and
    scan angles within a LiDAR (LAS) file. It will also output the percentiles of
    the distribution, e.g. 95th percentile of elevation.
  • Added the AddPointCoordinateToTable tool, which can be used to add the x-y
    coordinates of points within a Point type ShapeFile as fields within its attribute table.
  • Added a tool to filter the points in a LiDAR (LAS) file based on a threshold
    in the absolute scan angle. Currently the output is a shapefile of mass-point shape but eventually we would like to have it write to a new LAS file.
  • The Merge Points Files tool was replaced with the more general Merge Shapefiles
    tool, which works with any ShapeType.
  • Added the FindLowestHighestLocations tool, which will output vector points
    corresponding to the lowest and highest points in a raster. This has already come in handy several times.
  • Added ExtractRasterValuesAtPoints tool for extracting the cell values of each
    image in a raster stack corresponding to a set of vector points. Values are
    placed within the vector attribute table.
  • Added DeleteSmallLakesAndExtendRivers tool, which can be used to remove small lakes (polygons) in a vector drainage network and to extend the associated river
    network (intersecting polylines) into the interior of the lake. I created this tool in response to an interesting question asked over on the GIS Stack Exchange.
  • Added a Long Profile From Point tool, which can generate one or more longitudinal
    profiles for the flowpaths issuing from a set of vector points.
  • Modified the Mosaic With Feathering tool to handle RGB images in addition to
    continuous scale rasters. At the moment, this only works for the nearest-
    neighbour mode. I’m not sure why I didn’t do this earlier.
  • Added Image Stack Profile tool to create a line graphs for a set of vector points
    through an image stack, such as multi-spectral satellite image data. This can be handy for visualizing the spectral signatures of individual pixels.
  • Added a Simple Region Grow tool that will perform a very simple region grow
    segmentation of pixels in an image stack based on a specified threshold. I’d like to continue development in this area and eventually include a full object-based image segmentation.
  • Added parallel implementations of the D8 flow pointer and accumulation algorithms. At the moment this is really an experimental tool that is not intended for widespread use but there is more to come, including parallel versions of all the flow accumulation algorithms.
  • Fixed a bug in the Hillslopes tool.
  • I’ve added a whole suite of tools to the Elevation Residuals toolbox for
    performing multi-scaled topographic position analysis. This includes modified
    tools for calculating difference and deviation from mean elevations using an
    integral image approach that is extremely computationally efficient, even with
    large search windows. It also includes the Local Topographic Position Scale
    Signature and the Maximum Elevation Deviation tools. Combined with the
    ‘customRGBScript.groovy’ script, these functions allow for the creation of
    some spectacular multi-scale topographic position visualizations. See this link
    for examples.

And like a proud father, I can’t resist showing some nice pictures…

Whitebox GAT 3.2.2 screenshot

Whitebox GAT 3.2.2 screenshot

Whitebox GAT 3.2.2 screenshot

Whitebox GAT 3.2.2 screenshot

Whitebox GAT v. 3.2.2

Whitebox GAT v. 3.2.2

One point of note is that Oracle stopped supporting Java on Windows XP some time ago and therefore recent versions of Whitebox GAT no longer function on this platform. It’s time to upgrade!

Please let me know if you have any feedback or questions regarding the new version of Whitebox GAT and I hope you enjoy all the new goodies. As always, best wishes and happy geoprocessing.

UPDATE (April 15, 2015): It would seem that there were some issues with some of the plugin tools written as Groovy scripts that resulted from breaking changes associated with the update from Groovy 2.3 to Groovy 2.4.1. To overcome these issues I regressed the Groovy library linked to by Whitebox GAT to the 2.3 version and now all of the affected tools are working properly. Thank you to the users that alerted me to this issue.

Advertisements

Multi-scale topographic position visualizations

I thought that people would enjoy this beautiful map that I have been working on this holiday season. It is a visualization derived from an SRTM digital elevation model based on a multi-scale topographic position analysis. This is one that you really have to click to enlarge to fully appreciate. I have spent hours lost in the detailed galactic colouring of this map!

This beautiful map of eastern Canada and the US was made with Whitebox GAT’s newest multi-scale topographic position tools. (click on image to enlarge)

This beautiful map of eastern Canada and the US was made with Whitebox GAT’s newest multi-scale topographic position tools. (click on image to enlarge)

I’m working on a paper right now in which I describe a new form of local topographic position analysis and the map above is one of the resulting visualizations. It shows prominent features at the small (blue channel), medium (green channel), and large (red channel) scales. A prominent feature is one that is either significantly above or below the surrounding landscape at the scale of interest. There’s actually a fair amount of analysis (and coding!) involved but if you’re really interested and can’t wait for me to finish the paper, send me an email. Here is a similar map but covering parts of British Columbia, Canada:

Mulit-scale topographic position for British Columbia, Canada (click to enlarge)

Mulit-scale topographic position for British Columbia, Canada (click to enlarge)

And this is the multi-scale topographic position map derived from SRTM data for the Northern Territory of Australia:

Topographic position map of the Northern Territory, Australia (click to enlarge)

Topographic position map of the Northern Territory, Australia (click to enlarge)

Personally, I think that these visualizations are remarkable for their ability to characterize the structure of the surface geology of a region but I’m sure that there are many other interesting applications as well. Interpreting the images takes a bit of experience but the following interpretation key can help:

Interpretation Key

Interpretation Key

Leave your comments below and, as always, best wishes and happy geoprocessing.

****UPDATE (May 27, 2015)****

This work was eventually written up as a manuscript and has recently been accepted for publication by the journal Geomorphology. The citation is:

Lindsay, J.B., Cockburn, J.M.H., and Russell, H.A.J. In press. ‘An integral image approach to performing multi-scale topographic position analysis’ Geomorphology, 245, DOI: 10.1016/j.geomorph.2015.05.025.

This article can be downloaded for free until July 18, 2015 from the following link:

http://authors.elsevier.com/a/1R6Uf,3sl3TsZi

And the submitted draft is available here.

Parallelization of GIS Operations in Whitebox GAT

I often get feedback from new users of Whitebox Geospatial Analysis Tools about how surprisingly fast it is for whatever task they are doing. However whenever I develop a new tool or function for Whitebox, I always take algorithm performance into consideration. A big part of that consideration these days is the potential for parallelizing all or parts of an operation to take advantage of the of multi-core processors that have become ubiquitous. There are many geoprocessing tools in Whitebox GAT that take advantage of concurrency, e.g the Clip tool, the various interpolation tools for LiDAR, and several others.

I try to take a balanced approach with respect to parallelization. Not every operation will necessarily benefit from parallelization (some will actually become slower) and some operations are simply unparallelizable, while many tools will only benefit when a certain bottleneck in the processing workflow are calculated concurrently. Another major consideration when I develop a new tool or function is managing memory requirements. People keep talking about Big Data these days like it’s a new phenomenon, but we in the geomatics community have been dealing with massive datasets for as long as there’s been a community. Often parallelizing an operation is possible but adds significant memory requirements that would make the parallelized version only applicable to smaller datasets that can easily fit in system memory. When this is the case, I’ll sometimes provide an option to the user to choose a parallelized or memory-optimized version. I must always consider the scalability of a tool during design.

There is one common scenario when you can really benefit from parallelizing your workflow. Have you ever found yourself in a situation where you have dozens, or maybe even hundreds of files all of which need to have the same operation applied? Perhaps you have hundreds of LiDAR tiles and need to apply a depression removal operation on them all. Maybe you have dozens of satellite images that need to have the same convolution filter applied. It’s a fairly common situation for a GIS tech to find themselves in. Of course you could always write a quick Python, Javascript or Groovy script to automate the process and save you the tedium of having to perform each operation manually. But unless you dedicate some time to writing that script just right, it’s likely going to run sequentially. All of those extra processing cores on your machine are just going to idle away. I have a machine with 8 cores and boy does it bother me when they’re not all in use at all times! Well Whitebox has an answer for this particular situation and it’s called Run Plugin In Parallel.

Run Plugin In Parallel tool

Run Plugin In Parallel tool (click to enlarge)

With this tool you can call any of Whitebox’s 400+ plugin tools in a type of parallel batch mode. Each call of the tool will run on it’s own a separate thread. This can significantly reduce processing time, particularly when you have a four (or even eight) core system. Even custom plugin tools that you have developed will be available for targeting with the Run Plugin In Parallel tool. Using the tool is fairly straightforward. You have to specify the name of the plugin tool that you are running. Here it is important to remember that it is the proper plugin name and not the descriptive name that you may see displayed in the tools listings. Normally the proper plugin name is the same or similar to the descriptive name but without spaces. If in doubt, you can always look at the source code. The second input parameter is a text file. Each line within the text file provides the parameters that are supplied to the tool for one run. The input parameters are going to be specific to the tool and are the same that are used when you call a tool from a script. The help documentation for each tool has a Scripting section that describes the input parameters required to run the tool from a script. The following is an example of a text file that could be used for to run a parallelized batch mode of the FD8 Flow Accumulation tool (FlowAccumulationFD8):

/Documents/Data/DEM1.dep, /Documents/Data/FlowAccum1.dep, 1, specific catchment area (sca), true, not specified
/Documents/Data/DEM2.dep, /Documents/Data/FlowAccum2.dep, 1, specific catchment area (sca), true, not specified
/Documents/Data/DEM3.dep, /Documents/Data/FlowAccum3.dep, 1, specific catchment area (sca), true, not specified
/Documents/Data/DEM4.dep, /Documents/Data/FlowAccum4.dep, 1, specific catchment area (sca), true, not specified

Notice that each line should contain entries for each of the input parameters for the tool as specified by the help documentation, which in this case include:

demFile, outputFile, exponent, outputType, logTransformOutput, threshold

All of the parameters required for one run must be contained on the same line. The above example can be used to run the FD8 Flow Accumulation four times, each run on a separate thread, with four different inputs and four outputs. It will effectively quarter the time required to process the data on a typical quad-core system. But the real benefit occurs when you have many more files that need processing. If you have a more complex workflow involving more than one operation, you can run this parallel batch mode for each step (i.e. process all of the depression filling operations and then process all of the flow accumulation operations). You can even call the Run Plugin In Parallel tool from a script to automate these types of operations. If you’re working at that level, I think you’ve earned the right to call yourself a GIS ninja! Leave your comments below and, as always, best wishes and happy geoprocessing.

Scripting Custom Whitebox GAT Plugin Tools

You may know that you can use scripting in Whitebox Geospatial Analysis Tools to automate your workflows, but did you know that you can also use the same in-application scripting to develop your own custom plugin tool? Of course, you can develop a custom tool using the Java programming language in an integrated development environment like Netbeans (see the How to create a plugin tool for Whitebox tutorial in the Help) but that is a bit of an involved process. By far, the easiest and most rapid way to create a new tool is to use Whitebox’s scripting functionality. You can write script-based tools to do any sort of spatial analysis function including manipulating raster data, shapefiles, and even LAS LiDAR point clouds. The best part of developing a script-based tool is that because you don’t need to recompile the program and add the jar file into the appropriate directory every time you make a change to the code, testing your tool on real-life data directly in the Whitebox environment speeds up the testing phase of development considerably.

Have you ever noticed that the tools listed in the Tools panel have two different icons–the ‘wrench’ (tool) and the ‘scroll’ (ScriptIcon2) icons? The ‘scroll’ icon designates a tool that is written as a script  using the Whitebox Scripter. Importantly, when you develop a tool using the built-in scripting functionality, Whitebox will treat it in exactly the same way that it treats compiled Java tools. This means that it will be listed in the Tool treeview and listings, it will have the same type of dialog user-interface, and it will even be automatically available to be called from other scripts. There are more than one hundred standard plugin tools distributed with Whitebox that have been developed using scripting. People frequently ask me how I’ve managed to write so many GIS tools into Whitebox (over 400 at this point) and script-based tools are my secret advantage.

One of the major differences between these tools and the compiled Java tools, is that they are fully editable. That is, you can open a live version of the source code of scripted tools directly in Whitebox and modify the code and the changes will be integrated as soon as you save the file. You simply need to right-click over the tool in the Tool treeview and select Edit Script. Having the ability to dig deep into the functionality of a tool and even experiment with the code is a large part of the open-access development strategy that the Whitebox project has developed. And the user doesn’t have to worry about breaking the code, because you can always fix any changes that you make to the code by reverting to the original version simply by right-clicking over the tool’s icon and selecting Update Script From Code Repository or by selecting Update Scripts From Repository in the Tools menu (this will also give you a preview of any new scripts that have been committed to the repository after the release version that you are using). So feel free to mess around with the code for various tools. Experiment, tweak, investigate, improve, and have fun with it!

These script-based plugin tools can be developed using any of the three supported scripting languages, including Python, JavaScript, and Groovy. The implementation of the Python programming language used by Whitebox is called Jython (Python 2.7 currently), which runs on the Java platform. Similarly, the JavaScript engine used by Whitebox is called Nashorn and is the newly built (for Java 8) scripting environment that is baked directly into Java. It’s a modern JavaScript engine that has significant performance advantages over the previous JavaScript engine (Rhino) used by older versions of Java. Nashorn is to the Java platform what V8 is to Chrome. Groovy is a scripting language that runs on the Java platform and is the most similar of the three languages to the Java programming language itself. In fact, many people refer to Groovy as a superset of Java, in that most valid Java code will also be executable Groovy (although Groovy has several advances that make it less verbose and generally much nicer to program in than Java itself).

Let’s consider a simple function, a 3 x 3 mean filter run over a raster image, as an example of how you would write a Whitebox plugin tool using each of the three languages. The function, which performs an average of the 9 grid cells surrounding each cell in an input raster and outputs the mean to the corresponding cell in an output raster, is quite typical of the type of analysis done in raster GIS operations. The following is the Python version of this function (please note that there is some kind of bug in WordPress that seems to substitute quotation marks, less than and greater than symbols with things like ‘"’ which may appear in the following code):

# imports
import time
import os
from threading import Thread
from whitebox.ui.plugin_dialog import ScriptDialog
from java.awt.event import ActionListener
from whitebox.geospatialfiles import WhiteboxRaster
from whitebox.geospatialfiles.WhiteboxRasterBase import DataType

'''The following four variables are required for this 
   script to be integrated into the tool tree panel. 
   Comment them out if you want to remove the script.'''
name = "PythonExamplePlugin" 
descriptiveName = "Example Python Plugin" 
description = "Just an example of a plugin tool using Python."
toolboxes = ["topmost"] 
	
class PythonExamplePlugin(ActionListener):
    def __init__(self, args):
        if len(args) != 0:
            self.execute(args)
        else:
            ''' Create a dialog for this tool to collect user-specified
                tool parameters.''' 
            self.sd = ScriptDialog(pluginHost, "Python Example Plugin", self)	
			
            ''' Specifying the help file will display the html help
            // file in the help pane. This file should be be located 
            // in the help directory and have the same name as the 
            // class, with an html extension.'''
            helpFile = self.__class__.__name__
            self.sd.setHelpFile(helpFile)
            
            ''' Specifying the source file allows the 'view code' 
            // button on the tool dialog to be displayed.'''
            self.sd.setSourceFile(os.path.abspath(__file__))
            
            # add some components to the dialog '''
            self.sd.addDialogFile("Input raster file", "Input Raster File:", "open", "Raster Files (*.dep), DEP", True, False)
            self.sd.addDialogFile("Output raster file", "Output Raster File:", "save", "Raster Files (*.dep), DEP", True, False)
            
            # Resize the dialog to the standard size and display it '''
            self.sd.setSize(800, 400)
            self.sd.visible = True
            
    def actionPerformed(self, event):
        if event.getActionCommand() == "ok":
            args = self.sd.collectParameters()
            t = Thread(target=lambda: self.execute(args))
            t.start()

    ''' The execute function is the main part of the tool, where the actual
    work is completed.'''
    def execute(self, args):
        try:
            dX = [ 1, 1, 1, 0, -1, -1, -1, 0 ]
            dY = [ -1, 0, 1, 1, 1, 0, -1, -1 ]
            
            if len(args) != 2:
                pluginHost.showFeedback("Incorrect number of arguments given to tool.")
                return
                
            # read the input parameters
            inputfile = args[0]
            outputfile = args[1]
            
            # read the input image 
            inputraster = WhiteboxRaster(inputfile, 'r')
            nodata = inputraster.getNoDataValue()
            rows = inputraster.getNumberRows()
            cols = inputraster.getNumberColumns()
            
            # initialize the output image
            outputraster = WhiteboxRaster(outputfile, "rw", inputfile, DataType.FLOAT, nodata)
            outputraster.setPreferredPalette(inputraster.getPreferredPalette())
            
            '''perform the analysis
            This code loops through a raster and performs a 
            3 x 3 mean filter.'''
            oldprogress = -1
            for row in xrange(0, rows):
                for col in xrange(0, cols):
                    z = inputraster.getValue(row, col)
                    if z != nodata:
                        mean = z
                        numneighbours = 1
                        for n in xrange(0, 8):
                            zn = inputraster.getValue(row + dY[n], col + dX[n])
                            if zn != nodata:
                                mean += zn
                                numneighbours += 1
                                
                        outputraster.setValue(row, col, mean / numneighbours)
                        
                    progress = (int)(100.0 * row / (rows - 1))
                    if progress != oldprogress:
                        oldprogress = progress
                        pluginHost.updateProgress(progress)
                        if pluginHost.isRequestForOperationCancelSet():
                            pluginHost.showFeedback("Operation cancelled")
                            return
			
            inputraster.close()
            outputraster.addMetadataEntry("Created by the " + descriptiveName + " tool.")
            outputraster.addMetadataEntry("Created on " + time.asctime())
            outputraster.close()

            # display the output image
            pluginHost.returnData(outputfile)
            
            except Exception, e:
                print e
                pluginHost.showFeedback("An error has occurred during operation. See log file for details.")
                pluginHost.logException("Error in " + descriptiveName, e)
                return
            finally:
                # reset the progress bar
                pluginHost.updateProgress(0)
	
if args is None:
    pluginHost.showFeedback("The arguments array has not been set.")
else:		
    PythonExamplePlugin(args)

Many of you GIS ‘Pythonistas’ will feel right at home looking at that code. After saving the code in the Scripter, you need to relaunch Whitebox before the program will recognize the tool and include it in its list of plugins (you only have to do this once). This is what the dialog looks like when you run the tool either by double-clicking the tool in the Tools treeview or by selecting Execute in the Scripter:

Tool dialog

Tool dialog (click to enlarge)

It has two input parameters, an input raster file name and the name of the output raster. To do this up right, we could write a help file for the tool and make sure that it is saved in the Whitebox Help directory. To do this, you simply need to select the Create New Help Entry button on the dialog and enter the HTML document.

Now, the following code is the equivalent JavaScript tool:

// imports
var Runnable = Java.type('java.lang.Runnable');
var Thread = Java.type('java.lang.Thread');
var ActionListener = Java.type('java.awt.event.ActionListener');
var ScriptDialog = Java.type('whitebox.ui.plugin_dialog.ScriptDialog');
var WhiteboxRaster = Java.type('whitebox.geospatialfiles.WhiteboxRaster');
var DataType = Java.type('whitebox.geospatialfiles.WhiteboxRasterBase.DataType');

// The following four variables are what make this recognizable as 
// a plugin tool for Whitebox. Each of name, descriptiveName, 
// description and toolboxes must be present.
var name = "JavascriptExamplePlugin";
var descriptiveName = "Example JavaScript tool";
var description = "Just an example of a plugin tool using JavaScript.";
var toolboxes = ["topmost"];

// Create a dialog for the tool
function createDialog(args) {
    if (args.length !== 0) {
        execute(args);
    } else {
        // create an ActionListener to handle the return from the dialog
        var ac = new ActionListener({
            actionPerformed: function(event) {
        if (event.getActionCommand() === "ok") {
            var args = sd.collectParameters();
            sd.dispose();
	    var r = new Runnable({
	        run: function() {
	            execute(args);
	        }
	    });
	    var t = new Thread(r);
	    t.start();
	    }
	}
	});

        // Create the scriptdialog object
        sd = new ScriptDialog(pluginHost, descriptiveName, ac);
        
        // Add some components to it
        sd.addDialogFile("Input raster file", "Input Raster File:", "open", "Raster Files (*.dep), DEP", true, false);
        sd.addDialogFile("Output raster file", "Output Raster File:", "save", "Raster Files (*.dep), DEP", true, false);
        
        // Specifying the help file will display the html help
        // file in the help pane. This file should be be located 
        // in the help directory and have the same name as the 
        // class, with an html extension.
        sd.setHelpFile(toolName);
        
        // Specifying the source file allows the 'view code' 
        // button on the tool dialog to be displayed.
        var scriptFile = pluginHost.getResourcesDirectory() + "plugins/Scripts/" + toolName + ".js";
        sd.setSourceFile(scriptFile);
        		
        // set the dialog size and make it visible
        sd.setSize(800, 400);
        sd.visible = true;
        return sd;
    }
}

// The execute function is the main part of the tool, where the actual
// work is completed.
function execute(args) {
    try {
        // declare  some variables for later
        var z, zn, mean;
        var numNeighbours;
        // read in the arguments
        if (args.length < 2) {
            pluginHost.showFeedback("The tool is being run without the correct number of parameters");
            return;
        }
        var inputFile = args[0];
        var outputFile = args[1];
        
        // setup the raster
        var input = new WhiteboxRaster(inputFile, "rw");
        var rows = input.getNumberRows();
        var cols = input.getNumberColumns();
        var nodata = input.getNoDataValue();
        var output = new WhiteboxRaster(outputFile, "rw", inputFile, DataType.FLOAT, nodata);
        output.setPreferredPalette(input.getPreferredPalette());
        
        /* perform the analysis
          This code loops through a raster and performs a 
 	3 x 3 mean filter. */
        var dX = [ 1, 1, 1, 0, -1, -1, -1, 0 ];
        var dY = [ -1, 0, 1, 1, 1, 0, -1, -1 ];
        var progress, oldProgress = -1;
        for (row = 0; row < rows; row++) {
            for (col = 0; col < cols; col++) {
                var z = input.getValue(row, col);
                if (z != nodata) {
                    mean = z;
                    numNeighbours = 1;
                    for (n = 0; n < 8; n++) {
                        zn = input.getValue(row + dY[n], col + dX[n]);
                        if (zn !== nodata) {
                            mean += zn;
                            numNeighbours++;
                        }
                    }
                    output.setValue(row, col, mean / numNeighbours);
                }
            }
            progress = row * 100.0 / (rows - 1);
            if (progress !== oldProgress) {
                pluginHost.updateProgress(progress);
                oldProgress = progress;
                // check to see if the user has requested a cancellation
                if (pluginHost.isRequestForOperationCancelSet()) {
                    pluginHost.showFeedback("Operation cancelled");
                    return;
                }
            }
        }
	
        input.close();
        output.addMetadataEntry("Created by the " + descriptiveName + " tool.");
        output.addMetadataEntry("Created on " + new Date());
        output.close();
        
        // display the output image
        pluginHost.returnData(outputFile);

    } catch (err) {
        pluginHost.showFeedback("An error has occurred:\n" + err);
        pluginHost.logException("Error in " + descriptiveName, err);
    } finally {
        // reset the progress bar
        pluginHost.updateProgress("Progress:", 0);
    }
}

if (args === null) {
    pluginHost.showFeedback("The arguments array has not been set.");
} else {
    var sd = createDialog(args);
}

And lastly, this is the equivalent Groovy code:

import java.awt.event.ActionListener
import java.awt.event.ActionEvent
import java.util.Date
import whitebox.interfaces.WhiteboxPluginHost
import whitebox.geospatialfiles.WhiteboxRaster
import whitebox.geospatialfiles.WhiteboxRasterBase.DataType
import whitebox.ui.plugin_dialog.ScriptDialog
import groovy.transform.CompileStatic

// The following four variables are required for this 
// script to be integrated into the tool tree panel. 
// Comment them out if you want to remove the script.
def name = "GroovyExamplePlugin"
def descriptiveName = "Example Groovy tool"
def description = "Just an example of a plugin tool using Groovy."
def toolboxes = ["topmost"]

public class GroovyExamplePlugin {
    private WhiteboxPluginHost pluginHost
    private String descriptiveName
    public GroovyExamplePlugin(WhiteboxPluginHost pluginHost, 
        String[] args, String name, String descriptiveName) {
        this.pluginHost = pluginHost;
        this.descriptiveName = descriptiveName;
        if (args.length > 0) {
            execute(args)
        } else {
            // create an ActionListener to handle the return from the dialog
            def ac = new ActionListener() {
                public void actionPerformed(ActionEvent event) {
                    if (event.getActionCommand().equals("ok")) {
                        args = sd.collectParameters()
                        sd.dispose()
                        final Runnable r = new Runnable() {
                            @Override
                            public void run() {
                                execute(args)
                            }
                        }
                        final Thread t = new Thread(r)
                        t.start()
                    }
                }
	    };
            
            // Create a dialog for this tool to collect user-specified
            // tool parameters.
            def sd = new ScriptDialog(pluginHost, descriptiveName, ac)	
            
            // Specifying the help file will display the html help
            // file in the help pane. This file should be be located 
            // in the help directory and have the same name as the 
            // class, with an html extension.
            sd.setHelpFile(name)
            
            // Specifying the source file allows the 'view code' 
            // button on the tool dialog to be displayed.
            def scriptFile = pluginHost.getResourcesDirectory() + "plugins" + File.separator + "Scripts" + File.separator + name + ".groovy"
            sd.setSourceFile(scriptFile)
            	
            // add some components to the dialog
            sd.addDialogFile("Input raster file", "Input Raster File:", "open", "Raster Files (*.dep), DEP", true, false)
            sd.addDialogFile("Output file", "Output Raster File:", "save", "Raster Files (*.dep), DEP", true, false)
            
            // resize the dialog to the standard size and display it
            sd.setSize(800, 400)
            sd.visible = true
        }
    }

    // The execute function is the main part of the tool, where the actual
    // work is completed.
    //@CompileStatic
    private void execute(String[] args) {
        try {
            int progress, oldProgress = -1, n, row, col, numNeighbours
            double z, zn, mean, nodata;
            int[] dX = [ 1, 1, 1, 0, -1, -1, -1, 0 ]
            int[] dY = [ -1, 0, 1, 1, 1, 0, -1, -1 ]
            
            if (args.length != 2) {
                pluginHost.showFeedback("Incorrect number of arguments given to tool.")
                return
            }
            // read the input parameters
            String inputFile = args[0]
            String outputFile = args[1]
            
            // read the input image
            WhiteboxRaster input = new WhiteboxRaster(inputFile, "r")
            nodata = input.getNoDataValue()
            int rows = input.getNumberRows()
            int cols = input.getNumberColumns()
            			
            // initialize the output image
            WhiteboxRaster output = new WhiteboxRaster(outputFile, "rw", inputFile, DataType.FLOAT, nodata)
            output.setPreferredPalette(input.getPreferredPalette());

            /* perform the analysis
            This code loops through a raster and performs a 
            3 x 3 mean filter. */
            for (row = 0; row < rows; row++) {
                for (col = 0; col < cols; col++) {
                    z = input.getValue(row, col);
                    if (z != nodata) {
                        mean = z;
                        numNeighbours = 1;
                        for (n = 0; n < 8; n++) {
                            zn = input.getValue(row + dY[n], col + dX[n]);
                            if (zn != nodata) {
                                mean += zn;
                                numNeighbours++;
                            }
                        }
                        output.setValue(row, col, mean / numNeighbours);
                    }
                }
                progress = (int)(100f * row / (rows - 1))
                if (progress != oldProgress) {
                    pluginHost.updateProgress(progress)
                    oldProgress = progress
                    // check to see if the user has requested a cancellation
                    if (pluginHost.isRequestForOperationCancelSet()) {
                        pluginHost.showFeedback("Operation cancelled")
                        return
                    }
                }
            }
			
            input.close()
            output.addMetadataEntry("Created by the " + descriptiveName + " tool.")
            output.addMetadataEntry("Created on " + new Date())
            output.close()
            
            // display the output image
            pluginHost.returnData(outputFile)
	
        } catch (Exception e) {
            pluginHost.showFeedback("An error has occurred during operation. See log file for details.")
            pluginHost.logException("Error in " + descriptiveName, e)
        } finally {
            // reset the progress bar
            pluginHost.updateProgress(0)
        }
    }
}

if (args == null) {
    pluginHost.showFeedback("Plugin arguments not set.")
} else {
    def myTool = new GroovyExamplePlugin(pluginHost, args, name, descriptiveName)
}

All three tools look identical and perform the exact same function. However, being developed using dynamically typed scripting languages, there is a performance penalty that exists compared with with the high-speed performance of a tool written in fast statically-typed Just-In-Time (JIT) compiled Java code. To compare the performance of each of our three identical plugin tools, I ran them each 10 times (I actually ran it 11 times and averaged the last 10 runs, to warm-up the JVM) on a 2,862 x 3,249 rows-by-columns raster grid and averaged the run time of each. Here are the results of the performance comparison:

Python: average time = 41.9 sec., lines of code = 124
JavaScript: average time = 9.8 sec., lines of code = 143
Groovy: average time = 3.0 sec., lines of code = 152

Much of the difference in the length of the programs (lines of code) are the result of the need to specify closing brackets in JavaScript and Groovy, compared to the elegant ‘meaningful whitespace’ of Python code. The real difference is obviously in the execution time of each of our three programs. The Python program was 4.3X slower than the JavaScript program and nearly 14X slower than the Groovy program. And here’s the best part; Groovy is actually an ‘optionally typed’ language, meaning that if you are looking to speed up performance even further, you have the option to statically compile various methods. Notice that commented out line in the Groovy source code, “//@CompileStatic”. Simply by removing the comments and running the program again, I was able to speed up the Groovy code even further:

Groovy (compile static): average time = 1.2 sec., lines of code = 152

That’s very nearly equivalent to the execution time of the program written in compiled Java code. That’s rather impressive! I know that in the GIS community, there is a great many Python programmers out there, and certainly with the popularity of web programming these days, there are even more JavaScript programmers. But if you’re wondering why so many of the 100+ script-based tools that are in Whitebox GAT are developed using a peculiarly named scripting language (Groovy) that you’ve probably never heard of before picking up Whitebox, that’s why. Truthfully, if you’re writing a tool that isn’t doing anything computationally demanding, then the Python and JavaScript are wonderful options to quickly build your tool. But if you’re doing something that involves intensive computation, perhaps consider writing the tool in Groovy. The nice thing about Whitebox is that you have the option and other than differences in performance, your tool will be treated by the program in exactly the same way. Ultimately, you should use the language that is most suited to the application at hand and the one that you are comfortable using.

Of course, if you develop a custom Whitebox plugin tool that you think might be useful for others, then consider donating your tool to the project so that it can be distributed to the whole community. To do so, simply e-mail me your source code and perhaps some data to perform testing on. Leave your comments below and, as always, best wishes and happy plugin tool writing!

The Nile River Basin from SRTM data

Someone asked the other day whether the cross-platform, free and open-source GIS Whitebox GAT can handle watershed delineation from massive, regional-scale DEMs. They had a particular interest in the Nile River basin. Heck, if you’re going to go big, why not go huge, right? So I decided to give it a try. First off, I used the Retrieve SRTM Data tool to download the approximately 800 SRTM 3-arcsecond (~90 m) tiles that make up the Nile River basin. This required some experimentation because my first attempt at doing so hit the boundary of the basin and I had to give it a second try. The tool downloaded each of the tiles and mosaicked them into a single large DEM. The final DEM was 45,601 rows by 25,201 columns (a little over 1.1 billion grid cells) and was 4.28 GB in size. I then used the new Breach Depressions (Fast) tool to hydrologically pre-process the DEM by removing artifact topographic depressions and flat areas (i.e. cells with no downslope neighbours). I used a D8 flow algorithm to calculate flow directions, perform flow accumulation, trace the flowpaths issuing from Lake Victoria (the White Nile) and Lake Tana (the Blue Nile), and lastly, to delineate the watershed. The result was this map:

The Nile River Basin (click to enlarge)

The Nile River Basin (click to enlarge)

To fully appreciate this amazing map, you need to enlarge it. Just to put a bit of perspective on the scale of this analysis, take a look at this one:

Nile River World Map (click to enlarge)

Nile River World Map (click to enlarge)

All of Europe has an area of approximately 10,180,000 km2 and the Nile River basin has an area of 3,400,000 km2. That is truly vast.

For my initial attempt, the one in which I truncated the watershed, I used my 13 inch Macbook Pro (2.8 GHz dual-core i7, 16 GB RAM, SSD). When I expanded the area, I also moved to my workstation (3.0 GHz 8-core Xeon, 64 GB RAM, SSD) just to speed up the process a little. I even extracted a long-profile for the White and Blue Nile, although I should have converted the distance units to metres:

Long profile for the Nile River, extracted from SRTM 3-arcsecond data (click to enlarge)

Long profile for the Nile River, extracted from SRTM 3-arcsecond data (click to enlarge).

It was the longest river that I have ever plotted a long profile for; of course, it is the longest river so I guess you can’t get much larger than that! I probably should have extracted the river network using a dispersive flow algorithm like Tarboton’s excellent D-infinity, since there are places where the river bifurcates (i.e. the river course splits), even before the delta. Nonetheless, I’m quite pleased with the result. In fact, I was quite surprised at how well the river course, extracted from the 90 m resolution SRTM DEM data, matched a mapped Nile River shapefile that I located:

Mapped vs. extracted river (click to enlarge)

Mapped vs. extracted river (click to enlarge)

Leave your comments below and, as always, best wishes and happy geoprocessing.

Whitebox GAT for ArcGIS users

A short time ago over on the Whitebox GAT listserv someone asked an excellent question. They asked whether there was such thing as a Whitebox GAT User’s Manual, since the transition to Whitebox GAT, when you’re coming from an ArcGIS background, can be a little difficult. Often there are corresponding tools in Whitebox GAT to those in ArcGIS but they will frequently (and sometimes frustratingly, I’d imagine) have different names. There are different workflows for doing similar tasks and in general you can feel a little like a fish out of water for a short while until you get use to the Whitebox GAT way of doing things.

I can certainly understand this perspective. To date, I’ve been the main contributor to the Whitebox GAT project and so it reflects my design decisions more than anything. And, let’s face it, I’m a little different. As I pointed out to the question asker, I have never in my past had a time when I have used ArcGIS on a day-to-day basis. I’ve used IDRISI extensively many years ago (and that fact is apparent in some aspects of Whitebox in the way certain things are done) but frankly I’m just not that familiar with ArcGIS. It doesn’t inform how I have developed Whitebox to a large extent.

While we’re still working on a draft of a comprehensive Whitebox GAT User’s Manual (thank you to Adam Bonnycastle for his efforts here), I have been busily compiling quick tutorials on common tasks in Whitebox, which I have been cleverly embedding in the Help menu:

Whitebox's Help menu

Whitebox’s Help menu

Notice, that list starting with ‘Getting the Whitebox source code’? I’m not sure to what extent this feature has been taken up, but it’s there. If you have any suggestions for other additions let me know either by emailing me or leaving a comment below. If you have any Whitebox GAT tutorials that you’ve created yourself and would like to donate to the project, also please let me know.

But this gets me to the main point of this blog. What I would like to do is offer a brief tutorial on the topic “Whitebox GAT for ArcGIS users“. It can be loosely formed as a list of common gottcha’s or ‘Them vs. Us’ comparisons. But as I’ve said above, I really don’t use ArcGIS, so I need to ask for your help. If you’re a Whitebox GAT user and are coming to it from an ArcGIS background, can you provide us with a list of the main difficulties you experienced when you first tried using Whitebox? We’re looking for some of the things that you had to spend time trying to figure out how to do. Wouldn’t it be great if you could help to make that transition a little easier for future Whitebox users? I’m a big believer in the idea of paying it forward. Leave your comments below and, as always, best wishes and happy geoprocessing.

Hexbinning in Whitebox GAT

The practice of binning point data to form a type of 2D histogram, density plot, or what is sometimes called a heatmap, is quite useful as an alternative for the cartographic display of of very dense points sets. This is particularly the case when the points experience significant overlap at the displayed scale. This is the type of operation that is used (in feature space) in Whitebox‘s Feature Space Plot tool. Normally when we perform binning, however, it is using a regular square or perhaps rectangular grid. The other day I read an interesting blog on hexbinning (see also the excellent post by Zachary Forest Johnson also on the topic of hexbinning), or the use of a hexagonal-shaped grid for the creation of density plots.

A Hex-binned heatmap

A Hex-binned heatmap (click to enlarge)

These blogs got me excited, in part because I have used hex-grids for various spatial operations in the past and am aware of several advantages to this tessellation structure. The linked hexbinning blogs point out that hexagonal grids are useful because the hexagon is the most circular-shaped polygon that tessellates. A consequence of this circularity is that hex grids tend to represent curves more naturally than square grids, which includes better representation of polygon boundaries during rasterization. But there are several other advantages that make hexbinning a worthwhile practice. For example, one consequence of the nearly circular shape of hexagons is that hex-grids are very closely packed compared with square grids. That is, the average spacing between hexagon centre points in a hex-grid is smaller than the equivalent average spacing in a square grid. One way of thinking about this characteristic is that it means that hexagonal grids require about 13% fewer data points then the square grid to represent a distribution at a comparable level of detail. Also, unlike a square grid, each cell in a hex-grid shares an equal-length boundary with its six neighbouring grid cells. With square grids, four of the eight neighbouring cells are connected through a point only. This causes all sorts of problems for spatial analysis, not the least of which is the characteristic orientation sensitivity of square grids; and don’t get me started on the effects of this characteristics for surface flow-path modelling on raster digital elevation models. Hex-grid cells are also equally distant to each of their six neighbours. I’ve even heard it argued before that given the shape of those cone and rod cells in our eyes, hex-grids are more naturally suited to the human visual system, although I’m not sure how true this is.

Hexagonal grids are certainly worthwhile data structures and hex-binning is a great way to make a heatmap. So, that said, I decided to write a tool to perform hex-binning in Whitebox GAT. The tool will be publicly available in the 3.2.1 release of the software, which will hopefully be out soon but here is a preview:

Whitebox's Hexbinning tool

Whitebox’s Hexbinning tool (click to enlarge)

The tool will take either an input shapefile of POINT or MULTIPOINT ShapeType, or a LiDAR LAS file and it will be housed in both the Vector Tools and LiDAR Tools toolboxes. Here is an example of a hex-binned density map (seen on right) derived using the tool applied to a distribution of 7.7 million points (seen on left) contained in a LAS file and derived using a terrestrial LiDAR system:

Hexbinned LiDAR points

Hexbinned LiDAR points (Click to enlarge)

Notice that the points in the image on the left are so incredibly dense in places that you cannot effectively see individual features; they overlap completely to form blanket areas of points. It wouldn’t matter how small I rendered the points, at the scale of the map, they would always coalesce into areas. The hex-binned heatmap is a far more effective way of visualizing the distribution of LiDAR points in this case.

The hexbinning tool is also very efficient. It took about two seconds to perform the binning on the 7.7 million LiDAR points above using my laptop. In the CUNY blog linked above, the author (I think it was Lee Hachadoorian) describes several problems that they ran into when they performed the hexbinning operation on their 285,000 points using QGIS. I suspect that their problems were at least partly the result of the fact that they performed the binning using point-in-polygon operations to identify the hexagon in which each point plotted. Point-in-polygon operations are computationally expensive and I think there is a better way here. A hex-grid is essentially the Voronoi diagram for the set of hexagon centre points, meaning that every location within a hexagon grid cell is nearest the hexagon’s centre than another other neighbouring cell centre point. As such, a more efficient way of performing hexbinning would be to enter each hexagon’s centre point into a KD-tree structure and perform a highly efficient nearest-neighbour operation on each of the data points to identify their enclosing hexagon. This is a fast operation in Whitebox and so the hexbinning tool works well even with massive point sets, as you commonly encounter when working with LiDAR data and other types of spatial data.

So I hope you have fun with this new tool and make some beautiful maps. Leave your comments below and, as always, best wishes and happy geoprocessing.

John Lindsay