Parallelization of GIS Operations in Whitebox GAT

I often get feedback from new users of Whitebox Geospatial Analysis Tools about how surprisingly fast it is for whatever task they are doing. That is no accident: whenever I develop a new tool or function for Whitebox, I always take algorithm performance into consideration. A big part of that consideration these days is the potential for parallelizing all or parts of an operation to take advantage of the multi-core processors that have become ubiquitous. There are many geoprocessing tools in Whitebox GAT that take advantage of concurrency, e.g. the Clip tool, the various interpolation tools for LiDAR, and several others.

I try to take a balanced approach with respect to parallelization. Not every operation will necessarily benefit from parallelization (some will actually become slower) and some operations are simply unparallelizable, while many tools will only benefit when a certain bottleneck in the processing workflow is calculated concurrently. Another major consideration when I develop a new tool or function is managing memory requirements. People keep talking about Big Data these days like it’s a new phenomenon, but we in the geomatics community have been dealing with massive datasets for as long as there’s been a community. Often parallelizing an operation is possible but adds significant memory requirements that would make the parallelized version only applicable to smaller datasets that can easily fit in system memory. When this is the case, I’ll sometimes provide an option to the user to choose a parallelized or memory-optimized version. I must always consider the scalability of a tool during design.

There is one common scenario when you can really benefit from parallelizing your workflow. Have you ever found yourself in a situation where you have dozens, or maybe even hundreds, of files, all of which need to have the same operation applied? Perhaps you have hundreds of LiDAR tiles and need to apply a depression removal operation on them all. Maybe you have dozens of satellite images that need to have the same convolution filter applied. It’s a fairly common situation for a GIS tech to find themselves in. Of course you could always write a quick Python, JavaScript or Groovy script to automate the process and save you the tedium of having to perform each operation manually. But unless you dedicate some time to writing that script just right, it’s likely going to run sequentially. All of those extra processing cores on your machine are just going to idle away. I have a machine with 8 cores and boy does it bother me when they’re not all in use at all times! Well, Whitebox has an answer for this particular situation and it’s called Run Plugin In Parallel.

[Figure: the Run Plugin In Parallel tool dialog]

With this tool you can call any of Whitebox’s 400+ plugin tools in a type of parallel batch mode. Each call of the tool will run on its own separate thread. This can significantly reduce processing time, particularly when you have a four (or even eight) core system. Even custom plugin tools that you have developed will be available for targeting with the Run Plugin In Parallel tool.

Using the tool is fairly straightforward. You have to specify the name of the plugin tool that you are running. Here it is important to remember that it is the proper plugin name and not the descriptive name that you may see displayed in the tools listings. Normally the proper plugin name is the same as or similar to the descriptive name but without spaces. If in doubt, you can always look at the source code. The second input parameter is a text file. Each line within the text file provides the parameters that are supplied to the tool for one run. The input parameters are specific to the tool and are the same ones that are used when you call the tool from a script. The help documentation for each tool has a Scripting section that describes the input parameters required to run the tool from a script. The following is an example of a text file that could be used to run a parallelized batch mode of the FD8 Flow Accumulation tool (FlowAccumulationFD8):

/Documents/Data/DEM1.dep, /Documents/Data/FlowAccum1.dep, 1, specific catchment area (sca), true, not specified
/Documents/Data/DEM2.dep, /Documents/Data/FlowAccum2.dep, 1, specific catchment area (sca), true, not specified
/Documents/Data/DEM3.dep, /Documents/Data/FlowAccum3.dep, 1, specific catchment area (sca), true, not specified
/Documents/Data/DEM4.dep, /Documents/Data/FlowAccum4.dep, 1, specific catchment area (sca), true, not specified

Notice that each line should contain entries for each of the input parameters for the tool as specified by the help documentation, which in this case include:

demFile, outputFile, exponent, outputType, logTransformOutput, threshold

All of the parameters required for one run must be contained on the same line. The above example can be used to run the FD8 Flow Accumulation tool four times, each run on a separate thread, with four different inputs and four outputs. On a typical quad-core system, this can cut the time required to process the data to roughly a quarter. But the real benefit occurs when you have many more files that need processing. If you have a more complex workflow involving more than one operation, you can run this parallel batch mode for each step (i.e. process all of the depression filling operations and then process all of the flow accumulation operations). You can even call the Run Plugin In Parallel tool from a script to automate these types of operations. If you’re working at that level, I think you’ve earned the right to call yourself a GIS ninja! Leave your comments below and, as always, best wishes and happy geoprocessing.
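Writing the parameter file by hand gets tedious once there are more than a handful of tiles. The following is a minimal sketch, in plain Python, of one way to generate it. The helper names (build_parameter_lines, write_parameter_file) and the FlowAccum output naming scheme are illustrative assumptions; the six parameter values simply mirror the FlowAccumulationFD8 example above.

```python
def build_parameter_lines(dem_files, output_dir):
    """Return one comma-separated parameter line per input DEM.

    Parameter order follows the example above: demFile, outputFile,
    exponent, outputType, logTransformOutput, threshold.
    """
    lines = []
    for i, dem in enumerate(dem_files, start=1):
        # Hypothetical output naming scheme: FlowAccum1.dep, FlowAccum2.dep, ...
        out = output_dir + "/FlowAccum" + str(i) + ".dep"
        params = [dem, out, "1", "specific catchment area (sca)", "true", "not specified"]
        lines.append(", ".join(params))
    return lines


def write_parameter_file(lines, path):
    """Write the parameter lines to a text file, one run per line."""
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")


if __name__ == "__main__":
    dems = ["/Documents/Data/DEM%d.dep" % n for n in range(1, 5)]
    for line in build_parameter_lines(dems, "/Documents/Data"):
        print(line)
```

The resulting text file is then supplied as the second input of the Run Plugin In Parallel tool, exactly as with a hand-written file.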

Scripting Custom Whitebox GAT Plugin Tools

You may know that you can use scripting in Whitebox Geospatial Analysis Tools to automate your workflows, but did you know that you can also use the same in-application scripting to develop your own custom plugin tool? Of course, you can develop a custom tool using the Java programming language in an integrated development environment like NetBeans (see the How to create a plugin tool for Whitebox tutorial in the Help) but that is a bit of an involved process. By far, the easiest and most rapid way to create a new tool is to use Whitebox’s scripting functionality. You can write script-based tools to do any sort of spatial analysis function including manipulating raster data, shapefiles, and even LAS LiDAR point clouds. The best part of developing a script-based tool is that you don’t need to recompile the program and add the jar file into the appropriate directory every time you make a change to the code. You can test your tool on real-life data directly in the Whitebox environment, which speeds up the testing phase of development considerably.

Have you ever noticed that the tools listed in the Tools panel have two different icons, the ‘wrench’ and the ‘scroll’? The ‘scroll’ icon designates a tool that is written as a script using the Whitebox Scripter. Importantly, when you develop a tool using the built-in scripting functionality, Whitebox will treat it in exactly the same way that it treats compiled Java tools. This means that it will be listed in the Tool treeview and listings, it will have the same type of dialog user-interface, and it will even be automatically available to be called from other scripts. There are more than one hundred standard plugin tools distributed with Whitebox that have been developed using scripting. People frequently ask me how I’ve managed to write so many GIS tools into Whitebox (over 400 at this point) and script-based tools are my secret advantage.

One of the major differences between these tools and the compiled Java tools is that they are fully editable. That is, you can open a live version of the source code of scripted tools directly in Whitebox and modify the code, and the changes will be integrated as soon as you save the file. You simply need to right-click over the tool in the Tool treeview and select Edit Script. Having the ability to dig deep into the functionality of a tool and even experiment with the code is a large part of the open-access development strategy that the Whitebox project has developed. And you don’t have to worry about breaking the code, because you can always undo any changes that you make by reverting to the original version, either by right-clicking over the tool’s icon and selecting Update Script From Code Repository or by selecting Update Scripts From Repository in the Tools menu (this will also give you a preview of any new scripts that have been committed to the repository after the release version that you are using). So feel free to mess around with the code for various tools. Experiment, tweak, investigate, improve, and have fun with it!

These script-based plugin tools can be developed using any of the three supported scripting languages: Python, JavaScript, and Groovy. The implementation of the Python programming language used by Whitebox is called Jython (Python 2.7 currently), which runs on the Java platform. Similarly, the JavaScript engine used by Whitebox is called Nashorn and is the newly built (for Java 8) scripting environment that is baked directly into Java. It’s a modern JavaScript engine that has significant performance advantages over the previous JavaScript engine (Rhino) used by older versions of Java. Nashorn is to the Java platform what V8 is to Chrome. Groovy is a scripting language that runs on the Java platform and is the most similar of the three languages to the Java programming language itself. In fact, many people refer to Groovy as a superset of Java, in that most valid Java code will also be executable Groovy (although Groovy has several enhancements that make it less verbose and generally much nicer to program in than Java itself).

Let’s consider a simple function, a 3 x 3 mean filter run over a raster image, as an example of how you would write a Whitebox plugin tool using each of the three languages. The function, which averages each cell in an input raster with its eight surrounding neighbours and writes the mean to the corresponding cell in an output raster, is quite typical of the type of analysis done in raster GIS operations. The following is the Python version of this function:

# imports
import time
import os
from threading import Thread
from whitebox.ui.plugin_dialog import ScriptDialog
from java.awt.event import ActionListener
from whitebox.geospatialfiles import WhiteboxRaster
from whitebox.geospatialfiles.WhiteboxRasterBase import DataType

'''The following four variables are required for this 
   script to be integrated into the tool tree panel. 
   Comment them out if you want to remove the script.'''
name = "PythonExamplePlugin" 
descriptiveName = "Example Python Plugin" 
description = "Just an example of a plugin tool using Python."
toolboxes = ["topmost"] 
	
class PythonExamplePlugin(ActionListener):
    def __init__(self, args):
        if len(args) != 0:
            self.execute(args)
        else:
            # Create a dialog for this tool to collect user-specified
            # tool parameters.
            self.sd = ScriptDialog(pluginHost, "Python Example Plugin", self)

            # Specifying the help file will display the html help
            # file in the help pane. This file should be located
            # in the help directory and have the same name as the
            # class, with an html extension.
            helpFile = self.__class__.__name__
            self.sd.setHelpFile(helpFile)

            # Specifying the source file allows the 'view code'
            # button on the tool dialog to be displayed.
            self.sd.setSourceFile(os.path.abspath(__file__))

            # add some components to the dialog
            self.sd.addDialogFile("Input raster file", "Input Raster File:", "open", "Raster Files (*.dep), DEP", True, False)
            self.sd.addDialogFile("Output raster file", "Output Raster File:", "save", "Raster Files (*.dep), DEP", True, False)

            # Resize the dialog to the standard size and display it
            self.sd.setSize(800, 400)
            self.sd.visible = True
            
    def actionPerformed(self, event):
        if event.getActionCommand() == "ok":
            args = self.sd.collectParameters()
            t = Thread(target=lambda: self.execute(args))
            t.start()

    ''' The execute function is the main part of the tool, where the actual
    work is completed.'''
    def execute(self, args):
        try:
            dX = [ 1, 1, 1, 0, -1, -1, -1, 0 ]
            dY = [ -1, 0, 1, 1, 1, 0, -1, -1 ]
            
            if len(args) != 2:
                pluginHost.showFeedback("Incorrect number of arguments given to tool.")
                return
                
            # read the input parameters
            inputfile = args[0]
            outputfile = args[1]
            
            # read the input image 
            inputraster = WhiteboxRaster(inputfile, 'r')
            nodata = inputraster.getNoDataValue()
            rows = inputraster.getNumberRows()
            cols = inputraster.getNumberColumns()
            
            # initialize the output image
            outputraster = WhiteboxRaster(outputfile, "rw", inputfile, DataType.FLOAT, nodata)
            outputraster.setPreferredPalette(inputraster.getPreferredPalette())
            
            # perform the analysis
            # This code loops through a raster and performs a
            # 3 x 3 mean filter.
            oldprogress = -1
            for row in xrange(0, rows):
                for col in xrange(0, cols):
                    z = inputraster.getValue(row, col)
                    if z != nodata:
                        mean = z
                        numneighbours = 1
                        for n in xrange(0, 8):
                            zn = inputraster.getValue(row + dY[n], col + dX[n])
                            if zn != nodata:
                                mean += zn
                                numneighbours += 1

                        outputraster.setValue(row, col, mean / numneighbours)

                # update the progress bar once per row, rather than once per cell
                progress = int(100.0 * row / (rows - 1))
                if progress != oldprogress:
                    oldprogress = progress
                    pluginHost.updateProgress(progress)
                    # check to see if the user has requested a cancellation
                    if pluginHost.isRequestForOperationCancelSet():
                        pluginHost.showFeedback("Operation cancelled")
                        return
			
            inputraster.close()
            outputraster.addMetadataEntry("Created by the " + descriptiveName + " tool.")
            outputraster.addMetadataEntry("Created on " + time.asctime())
            outputraster.close()

            # display the output image
            pluginHost.returnData(outputfile)
            
        except Exception as e:
            print(e)
            pluginHost.showFeedback("An error has occurred during operation. See log file for details.")
            pluginHost.logException("Error in " + descriptiveName, e)
            return
        finally:
            # reset the progress bar
            pluginHost.updateProgress(0)
	
if args is None:
    pluginHost.showFeedback("The arguments array has not been set.")
else:		
    PythonExamplePlugin(args)

Many of you GIS ‘Pythonistas’ will feel right at home looking at that code. After saving the code in the Scripter, you need to relaunch Whitebox before the program will recognize the tool and include it in its list of plugins (you only have to do this once). This is what the dialog looks like when you run the tool either by double-clicking the tool in the Tools treeview or by selecting Execute in the Scripter:
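To check the filter logic without launching Whitebox, the same neighbourhood loop can be exercised on a small in-memory grid. The following is a minimal sketch in plain Python. The function name mean_filter_3x3 and the list-of-lists grid are illustrative stand-ins for the WhiteboxRaster objects, and treating cells beyond the grid edge as nodata is an assumption made for this sketch.

```python
def mean_filter_3x3(grid, nodata):
    """Apply a 3 x 3 mean filter to a list-of-lists grid, skipping nodata cells."""
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    # The same eight neighbour offsets used by the plugin code above.
    dx = [1, 1, 1, 0, -1, -1, -1, 0]
    dy = [-1, 0, 1, 1, 1, 0, -1, -1]

    def value(r, c):
        # Assumption: cells outside the grid are treated as nodata.
        if 0 <= r < rows and 0 <= c < cols:
            return grid[r][c]
        return nodata

    out = [[nodata] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            z = value(r, c)
            if z == nodata:
                continue  # nodata cells stay nodata in the output
            total, count = z, 1
            for n in range(8):
                zn = value(r + dy[n], c + dx[n])
                if zn != nodata:
                    total += zn
                    count += 1
            out[r][c] = float(total) / count
    return out


if __name__ == "__main__":
    nodata = -32768.0
    grid = [[1.0, 2.0, 3.0],
            [4.0, 5.0, 6.0],
            [7.0, 8.0, nodata]]
    for row in mean_filter_3x3(grid, nodata):
        print(row)
```

Note that the centre cell is averaged over only the eight valid values in its window, because the nodata cell in the corner is excluded from both the sum and the count.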

[Figure: the tool dialog]

It has two input parameters, an input raster file name and the name of the output raster. To do this up right, we could write a help file for the tool and make sure that it is saved in the Whitebox Help directory. To do this, you simply need to select the Create New Help Entry button on the dialog and enter the HTML document.

Now, the following code is the equivalent JavaScript tool:

// imports
var Runnable = Java.type('java.lang.Runnable');
var Thread = Java.type('java.lang.Thread');
var ActionListener = Java.type('java.awt.event.ActionListener');
var ScriptDialog = Java.type('whitebox.ui.plugin_dialog.ScriptDialog');
var WhiteboxRaster = Java.type('whitebox.geospatialfiles.WhiteboxRaster');
var DataType = Java.type('whitebox.geospatialfiles.WhiteboxRasterBase.DataType');

// The following four variables are what make this recognizable as 
// a plugin tool for Whitebox. Each of name, descriptiveName, 
// description and toolboxes must be present.
var name = "JavascriptExamplePlugin";
var descriptiveName = "Example JavaScript tool";
var description = "Just an example of a plugin tool using JavaScript.";
var toolboxes = ["topmost"];

// Create a dialog for the tool
function createDialog(args) {
    if (args.length !== 0) {
        execute(args);
    } else {
        // create an ActionListener to handle the return from the dialog
        var ac = new ActionListener({
            actionPerformed: function(event) {
                if (event.getActionCommand() === "ok") {
                    var args = sd.collectParameters();
                    sd.dispose();
                    var r = new Runnable({
                        run: function() {
                            execute(args);
                        }
                    });
                    var t = new Thread(r);
                    t.start();
                }
            }
        });

        // Create the scriptdialog object
        sd = new ScriptDialog(pluginHost, descriptiveName, ac);
        
        // Add some components to it
        sd.addDialogFile("Input raster file", "Input Raster File:", "open", "Raster Files (*.dep), DEP", true, false);
        sd.addDialogFile("Output raster file", "Output Raster File:", "save", "Raster Files (*.dep), DEP", true, false);
        
        // Specifying the help file will display the html help
        // file in the help pane. This file should be located 
        // in the help directory and have the same name as the 
        // class, with an html extension. Here 'name' is the
        // plugin name declared at the top of the script.
        sd.setHelpFile(name);
        
        // Specifying the source file allows the 'view code' 
        // button on the tool dialog to be displayed.
        var scriptFile = pluginHost.getResourcesDirectory() + "plugins/Scripts/" + name + ".js";
        sd.setSourceFile(scriptFile);
        		
        // set the dialog size and make it visible
        sd.setSize(800, 400);
        sd.visible = true;
        return sd;
    }
}

// The execute function is the main part of the tool, where the actual
// work is completed.
function execute(args) {
    try {
        // declare  some variables for later
        var z, zn, mean;
        var numNeighbours;
        // read in the arguments
        if (args.length < 2) {
            pluginHost.showFeedback("The tool is being run without the correct number of parameters");
            return;
        }
        var inputFile = args[0];
        var outputFile = args[1];
        
        // setup the raster
        var input = new WhiteboxRaster(inputFile, "r");
        var rows = input.getNumberRows();
        var cols = input.getNumberColumns();
        var nodata = input.getNoDataValue();
        var output = new WhiteboxRaster(outputFile, "rw", inputFile, DataType.FLOAT, nodata);
        output.setPreferredPalette(input.getPreferredPalette());
        
        /* perform the analysis
           This code loops through a raster and performs a 
           3 x 3 mean filter. */
        var dX = [ 1, 1, 1, 0, -1, -1, -1, 0 ];
        var dY = [ -1, 0, 1, 1, 1, 0, -1, -1 ];
        var progress, oldProgress = -1;
        // declare the loop counters so that they are not implicit globals
        var row, col, n;
        for (row = 0; row < rows; row++) {
            for (col = 0; col < cols; col++) {
                z = input.getValue(row, col);
                if (z !== nodata) {
                    mean = z;
                    numNeighbours = 1;
                    for (n = 0; n < 8; n++) {
                        zn = input.getValue(row + dY[n], col + dX[n]);
                        if (zn !== nodata) {
                            mean += zn;
                            numNeighbours++;
                        }
                    }
                    output.setValue(row, col, mean / numNeighbours);
                }
            }
            // use a whole-number percentage so the progress bar is only
            // updated when the value actually changes
            progress = Math.floor(row * 100.0 / (rows - 1));
            if (progress !== oldProgress) {
                pluginHost.updateProgress(progress);
                oldProgress = progress;
                // check to see if the user has requested a cancellation
                if (pluginHost.isRequestForOperationCancelSet()) {
                    pluginHost.showFeedback("Operation cancelled");
                    return;
                }
            }
        }
	
        input.close();
        output.addMetadataEntry("Created by the " + descriptiveName + " tool.");
        output.addMetadataEntry("Created on " + new Date());
        output.close();
        
        // display the output image
        pluginHost.returnData(outputFile);

    } catch (err) {
        pluginHost.showFeedback("An error has occurred:\n" + err);
        pluginHost.logException("Error in " + descriptiveName, err);
    } finally {
        // reset the progress bar
        pluginHost.updateProgress("Progress:", 0);
    }
}

if (args === null) {
    pluginHost.showFeedback("The arguments array has not been set.");
} else {
    var sd = createDialog(args);
}

And lastly, this is the equivalent Groovy code:

import java.awt.event.ActionListener
import java.awt.event.ActionEvent
import java.util.Date
import whitebox.interfaces.WhiteboxPluginHost
import whitebox.geospatialfiles.WhiteboxRaster
import whitebox.geospatialfiles.WhiteboxRasterBase.DataType
import whitebox.ui.plugin_dialog.ScriptDialog
import groovy.transform.CompileStatic

// The following four variables are required for this 
// script to be integrated into the tool tree panel. 
// Comment them out if you want to remove the script.
def name = "GroovyExamplePlugin"
def descriptiveName = "Example Groovy tool"
def description = "Just an example of a plugin tool using Groovy."
def toolboxes = ["topmost"]

public class GroovyExamplePlugin {
    private WhiteboxPluginHost pluginHost
    private String descriptiveName
    // The dialog is held in a field so that the ActionListener below
    // can refer to it before it has been created.
    private ScriptDialog sd
    public GroovyExamplePlugin(WhiteboxPluginHost pluginHost, 
        String[] args, String name, String descriptiveName) {
        this.pluginHost = pluginHost;
        this.descriptiveName = descriptiveName;
        if (args.length > 0) {
            execute(args)
        } else {
            // create an ActionListener to handle the return from the dialog
            def ac = new ActionListener() {
                public void actionPerformed(ActionEvent event) {
                    if (event.getActionCommand().equals("ok")) {
                        args = sd.collectParameters()
                        sd.dispose()
                        final Runnable r = new Runnable() {
                            @Override
                            public void run() {
                                execute(args)
                            }
                        }
                        final Thread t = new Thread(r)
                        t.start()
                    }
                }
            };
            
            // Create a dialog for this tool to collect user-specified
            // tool parameters.
            sd = new ScriptDialog(pluginHost, descriptiveName, ac)
            
            // Specifying the help file will display the html help
            // file in the help pane. This file should be located 
            // in the help directory and have the same name as the 
            // class, with an html extension.
            sd.setHelpFile(name)
            
            // Specifying the source file allows the 'view code' 
            // button on the tool dialog to be displayed.
            def scriptFile = pluginHost.getResourcesDirectory() + "plugins" + File.separator + "Scripts" + File.separator + name + ".groovy"
            sd.setSourceFile(scriptFile)
            	
            // add some components to the dialog
            sd.addDialogFile("Input raster file", "Input Raster File:", "open", "Raster Files (*.dep), DEP", true, false)
            sd.addDialogFile("Output file", "Output Raster File:", "save", "Raster Files (*.dep), DEP", true, false)
            
            // resize the dialog to the standard size and display it
            sd.setSize(800, 400)
            sd.visible = true
        }
    }

    // The execute function is the main part of the tool, where the actual
    // work is completed.
    //@CompileStatic
    private void execute(String[] args) {
        try {
            int progress, oldProgress = -1, n, row, col, numNeighbours
            double z, zn, mean, nodata;
            int[] dX = [ 1, 1, 1, 0, -1, -1, -1, 0 ]
            int[] dY = [ -1, 0, 1, 1, 1, 0, -1, -1 ]
            
            if (args.length != 2) {
                pluginHost.showFeedback("Incorrect number of arguments given to tool.")
                return
            }
            // read the input parameters
            String inputFile = args[0]
            String outputFile = args[1]
            
            // read the input image
            WhiteboxRaster input = new WhiteboxRaster(inputFile, "r")
            nodata = input.getNoDataValue()
            int rows = input.getNumberRows()
            int cols = input.getNumberColumns()
            			
            // initialize the output image
            WhiteboxRaster output = new WhiteboxRaster(outputFile, "rw", inputFile, DataType.FLOAT, nodata)
            output.setPreferredPalette(input.getPreferredPalette());

            /* perform the analysis
            This code loops through a raster and performs a 
            3 x 3 mean filter. */
            for (row = 0; row < rows; row++) {
                for (col = 0; col < cols; col++) {
                    z = input.getValue(row, col);
                    if (z != nodata) {
                        mean = z;
                        numNeighbours = 1;
                        for (n = 0; n < 8; n++) {
                            zn = input.getValue(row + dY[n], col + dX[n]);
                            if (zn != nodata) {
                                mean += zn;
                                numNeighbours++;
                            }
                        }
                        output.setValue(row, col, mean / numNeighbours);
                    }
                }
                progress = (int)(100f * row / (rows - 1))
                if (progress != oldProgress) {
                    pluginHost.updateProgress(progress)
                    oldProgress = progress
                    // check to see if the user has requested a cancellation
                    if (pluginHost.isRequestForOperationCancelSet()) {
                        pluginHost.showFeedback("Operation cancelled")
                        return
                    }
                }
            }
			
            input.close()
            output.addMetadataEntry("Created by the " + descriptiveName + " tool.")
            output.addMetadataEntry("Created on " + new Date())
            output.close()
            
            // display the output image
            pluginHost.returnData(outputFile)
	
        } catch (Exception e) {
            pluginHost.showFeedback("An error has occurred during operation. See log file for details.")
            pluginHost.logException("Error in " + descriptiveName, e)
        } finally {
            // reset the progress bar
            pluginHost.updateProgress(0)
        }
    }
}

if (args == null) {
    pluginHost.showFeedback("Plugin arguments not set.")
} else {
    def myTool = new GroovyExamplePlugin(pluginHost, args, name, descriptiveName)
}

All three tools look identical and perform the exact same function. However, because they are written in dynamically typed scripting languages, they carry a performance penalty compared with a tool written in statically typed, Just-In-Time (JIT) compiled Java code. To compare the performance of our three plugin tools, I ran each of them 11 times on a raster grid of 2,862 rows by 3,249 columns, discarded the first run to warm up the JVM, and averaged the run times of the remaining 10. Here are the results of the performance comparison:

Python: average time = 41.9 sec., lines of code = 124
JavaScript: average time = 9.8 sec., lines of code = 143
Groovy: average time = 3.0 sec., lines of code = 152
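The measurement procedure described above (one warm-up run followed by ten timed runs) is easy to reproduce for your own tools. The post does not show how the timings were collected, so the following is only a minimal sketch of one way to do it in plain Python; the function name average_runtime and the stand-in workload are illustrative assumptions.

```python
import time


def average_runtime(func, runs=10, warmup=1):
    """Call func (warmup + runs) times and return the mean of the last `runs` timings."""
    timings = []
    for i in range(warmup + runs):
        start = time.time()
        func()
        elapsed = time.time() - start
        if i >= warmup:
            # warm-up runs are executed but not included in the average
            timings.append(elapsed)
    return sum(timings) / len(timings)


if __name__ == "__main__":
    # Stand-in workload; in practice this would be a call that runs the tool.
    def workload():
        sum(x * x for x in range(100000))

    print("average time = %.4f sec." % average_runtime(workload))
```

Discarding the first run matters on the JVM because the earliest iterations include class loading and JIT compilation, which would otherwise inflate the average.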

Much of the difference in the length of the programs (lines of code) is the result of the need to specify closing brackets in JavaScript and Groovy, compared to the elegant ‘meaningful whitespace’ of Python code. The real difference is obviously in the execution time of each of our three programs. The Python program was about 4.3 times slower than the JavaScript program and nearly 14 times slower than the Groovy program. And here’s the best part: Groovy is actually an ‘optionally typed’ language, meaning that if you are looking to speed up performance even further, you have the option to statically compile various methods. Notice the commented-out line in the Groovy source code, “//@CompileStatic”. Simply by uncommenting that line and running the program again, I was able to speed up the Groovy code even further:

Groovy (compile static): average time = 1.2 sec., lines of code = 152

That’s very nearly equivalent to the execution time of the program written in compiled Java code. That’s rather impressive! I know that in the GIS community, there are a great many Python programmers out there, and certainly with the popularity of web programming these days, there are even more JavaScript programmers. But if you’re wondering why so many of the 100+ script-based tools that are in Whitebox GAT are developed using a peculiarly named scripting language (Groovy) that you’ve probably never heard of before picking up Whitebox, that’s why. Truthfully, if you’re writing a tool that isn’t doing anything computationally demanding, then Python and JavaScript are wonderful options to quickly build your tool. But if you’re doing something that involves intensive computation, perhaps consider writing the tool in Groovy. The nice thing about Whitebox is that you have the option and, other than differences in performance, your tool will be treated by the program in exactly the same way. Ultimately, you should use the language that is most suited to the application at hand and the one that you are comfortable using.
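As a quick check on the figures quoted above, the relative slowdowns follow directly from the four reported average times. The short sketch below just redoes that arithmetic; the dictionary and the slowdown helper are illustrative, and the numbers are the ones reported in this post.

```python
# Average run times (seconds) as reported in the comparison above.
times = {
    "Python": 41.9,
    "JavaScript": 9.8,
    "Groovy": 3.0,
    "Groovy (compile static)": 1.2,
}


def slowdown(slower, faster):
    """Return how many times longer `slower` took than `faster`."""
    return times[slower] / times[faster]


if __name__ == "__main__":
    pairs = [
        ("Python", "JavaScript"),
        ("Python", "Groovy"),
        ("Groovy", "Groovy (compile static)"),
    ]
    for slower, faster in pairs:
        print("%s vs %s: %.1f times slower" % (slower, faster, slowdown(slower, faster)))
```

The last pair shows that enabling static compilation made the Groovy tool about 2.5 times faster in this test.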

Of course, if you develop a custom Whitebox plugin tool that you think might be useful for others, then consider donating your tool to the project so that it can be distributed to the whole community. To do so, simply e-mail me your source code and perhaps some data to perform testing on. Leave your comments below and, as always, best wishes and happy plugin tool writing!

Workflow Automation Part 2

In my earlier post on workflow automation in Whitebox, Simon Seibert left a comment asking, “I would like to know if it possible to include for loops in the Scripter as well? I would like to run the same script over many files. Could you also provide an example for such a problem?” Well Simon, yes you can. Here is an example of a simple Python script that finds all of the raster files contained within the current working directory and then performs a very simple analysis on each file:

import os
# The following code will find each raster
# file (.dep) in the working directory and
# then run a mean filter on the image.
wd = pluginHost.getWorkingDirectory()
a = 1
for file in os.listdir(wd):
  if file.endswith(".dep"):
    inputFile = wd + file
    outputFile = wd +"output" + str(a) + ".dep"
    a += 1
    xDim = "3"
    yDim = "3"
    rounded = "false"
    reflectEdges = "true"
    args = [inputFile, outputFile, xDim, yDim, rounded, reflectEdges]
    pluginHost.runPlugin("FilterMean", args, False)

If you want to take it to the next level, you can parallelize the script so that each iteration is run on a separate thread. So, there you have it. Leave your comments below and, as always, best wishes and happy geoprocessing.
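As a rough sketch of what that parallel version might look like, the snippet below runs one job per thread using Python's standard threading module, with a semaphore capping how many run at once. The helper run_in_parallel, the max_threads parameter, the file names and the mean_filter_stub function are all made up for illustration; the stub stands in for the runPlugin call so that the sketch can be run outside of Whitebox. Whether a given plugin tool is safe to call from several threads at once is an assumption you would need to check for yourself.

```python
import threading

def run_in_parallel(task, arg_lists, max_threads=4):
    """Call task(args) for each entry of arg_lists, using at most
    max_threads worker threads at a time. Returns a list of
    (args, exception) pairs for any jobs that failed."""
    limiter = threading.Semaphore(max_threads)
    errors = []

    def worker(args):
        try:
            task(args)
        except Exception as e:
            errors.append((args, e))
        finally:
            limiter.release()  # free a slot for the next job

    threads = []
    for args in arg_lists:
        limiter.acquire()  # blocks while max_threads jobs are running
        t = threading.Thread(target=worker, args=(args,))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()  # wait for every job to finish
    return errors

completed = []

def mean_filter_stub(args):
    # Hypothetical stand-in. Inside the Whitebox Scripter this body
    # would instead be something like:
    #   pluginHost.runPlugin("FilterMean", args, False)
    completed.append(args[1])

wd = "/data/"  # assumed working directory, ending in a separator
jobs = []
for n, name in enumerate(["a.dep", "b.dep", "c.dep"], 1):
    jobs.append([wd + name, wd + "output" + str(n) + ".dep",
                 "3", "3", "false", "true"])

errors = run_in_parallel(mean_filter_stub, jobs, max_threads=2)
```

Capping the number of threads matters here because each job may hold an entire raster in memory, so launching one thread per file for hundreds of files could exhaust system memory.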

Workflow Automation in Whitebox GAT

(This is part 1 of a multi-part post. See Workflow Automation Part 2 here.)

Whitebox GAT provides users with a very user-friendly means of accessing geoprocessing tools through dialog boxes with built-in help documentation. The dialog boxes are very consistent in design and the tools are each readily accessible from the Tools tab in the tool tree-view and the quick-search lists. This likely provides the most convenient way to access this functionality for most of your day-to-day work. Every now and then, however, you’ll find yourself in a situation where you have a workflow with numerous intermediate steps and you need to repeat those steps several times over. This is a case for automation. Every plugin tool in Whitebox can be run automatically from a script. In fact, the help documentation for each tool provides a description of how that tool can be scripted. In the longer term I intend to develop a graphical workflow automator that will automatically generate scripts based on a workflow graph that the user creates. This type of functionality, similar to what is found in ArcGIS, Idrisi, and a few others, can be quite helpful for automating certain tasks. Graphical automators are never as powerful, however, as directly scripting a workflow, and that is why so many GIS users these days have become comfortable with scripting. It’s an extremely powerful tool for the GIS analyst to have.

Whitebox GAT currently provides scripting support for three programming languages: Python (actually Jython), Groovy, and JavaScript. I know that because ArcGIS supports Python scripting there are many GISers out there that are familiar with Python. Those of you who have been paying close attention to Whitebox development may have noticed that I have started to write a lot of new plugin tools using Groovy (most of Whitebox is developed using Java). Why Groovy instead of Python, you may ask? Well, Groovy is very nearly a superset of Java, meaning that most valid Java is also valid Groovy (if you know Java you already know Groovy). Groovy, being a scripting language, has some syntax advantages that make it much terser than regular Java. Compared with Jython, the Groovy project also seems to be more actively maintained and certainly has better computational performance, particularly with the optional static compilation. So for writing full-blown plugin tools, Groovy makes a lot of sense. (As an aside, how can you not like a language with such a great name?) But for simply automating workflows, I know that most of you are going to be looking at good old Python.

So, that said, below is a Python script that provides an example of how to automate a workflow. In this case, it automates the very common task in spatial hydrology of taking in a digital elevation model, removing the topographic depressions, calculating the flow directions, calculating the flow accumulation grid, then extracting streams. The script could be made much terser but I added a lot of comments to clarify. Simply paste the script into the Whitebox Scripter and press Execute Code. Scripting in Whitebox GAT can be a heck of a lot of fun and awfully addictive once you realize how much power you have at your fingertips! As always, best wishes and happy geoprocessing.

'''
Copyright (C) 2014 Dr. John Lindsay <jlindsay@uoguelph.ca>

This program is intended for instructional purposes only. The
following is an example of how to use Whitebox's scripting
capabilities to automate a geoprocessing workflow. The scripting
language is Python; more specifically it is Jython, the Python
implementation targeting the Java Virtual Machine (JVM).

In this script, we will take a digital elevation model (DEM),
remove all the topographic depressions from it (i.e. hydrologically
correct the DEM), calculate a flow direction pointer grid, use
the pointer file to perform a flow accumulation (i.e. upslope area)
calculation, then threshold the upslope area to derive valley lines
or streams. This is a fairly common workflow in spatial hydrology.

When you run a script from within Whitebox, a reference to the
Whitebox user interface (UI) will be automatically bound to your
script. Its variable name is 'pluginHost'. This is the primary
reason why the script must be run from within Whitebox's Scripter.

First we need the directory containing the data, and to set
the working directory to this. We will use the Vermont DEM contained
within the samples directory.
'''

import os

separator = os.sep # The system-specific directory separator
wd = pluginHost.getApplicationDirectory() + separator + "resources" + separator + "samples" + separator + "Vermont DEM" + separator
pluginHost.setWorkingDirectory(wd)

demFile = wd + "Vermont DEM.dep"
# Notice that spaces are allowed in file names. There is also no
# restriction on the length of the file name...in fact longer,
# descriptive names are preferred. Whitebox is friendly!

# A raster or vector file can be displayed by specifying the file
# name as an argument of the returnData method of the pluginHost
pluginHost.returnData(demFile)

'''
Remove the depressions in the DEM using the 'FillDepressions' tool.

The help file for each tool in Whitebox contains a section detailing
the required input parameters needed to run the tool from a script.
These parameters are always fed to the tool in a String array, in
the case below, called 'args'. The tool is then run using the 'runPlugin'
method of the pluginHost. runPlugin takes the name of the tool (see
the tool's help for the proper name), the arguments string array,
followed by two Boolean arguments. The first of these Boolean
arguments determines whether the plugin will be run on its own
separate thread. In most scripting applications, this should be set
to 'False' because the results of this tool are needed as inputs to
subsequent tools. The second Boolean argument specifies whether the
data that are returned to the pluginHost after the tool is completed
should be suppressed. Many tools will automatically display images
or shapefiles or some text report when they've completed. It is often
the case in a workflow that you only want the final result to be
displayed, in which case all of the runPlugins should have this final
Boolean parameter set to 'True' except for the last operation, for
which it should be set to 'False' (i.e. don't suppress the output).
The data will still be written to disk if the output is suppressed;
they simply won't be automatically displayed when the tool has
completed. If you don't specify this last Boolean parameter, the
output will be treated as normal.
'''
filledDEMFile = wd + "filled DEM.dep"
flatIncrement = "0.001" # Notice that although this is a numeric parameter, it is provided to the tool as a string.
args = [demFile, filledDEMFile, flatIncrement]
pluginHost.runPlugin("FillDepressions", args, False, True)

# Calculate the D8 pointer (flow direction) file.
pointerFile = wd + "pointer.dep"
args = [filledDEMFile, pointerFile]
pluginHost.runPlugin("FlowPointerD8", args, False, True)

# Perform the flow accumulation operation.
flowAccumFile = wd + "flow accumulation.dep"
outputType = "number of upslope grid cells"
logTransformOutput = "False"
args = [pointerFile, flowAccumFile, outputType, logTransformOutput]
pluginHost.runPlugin("FlowAccumD8", args, False, True)

# Extract the streams
streamsFile = wd + "streams.dep"
channelThreshold = "1000.0"
backValue = "NoData"
args = [flowAccumFile, streamsFile, channelThreshold, backValue]
pluginHost.runPlugin("ExtractStreams", args, False, False) # This final result will be displayed

'''
Note that in each of the examples above, I have created new variables
to hold each of the input parameters for the plugin tools. I've done
this more for clarity than anything else. The script could be
substantially shortened if the shorter values were directly entered
into the args array. For instance, I could have easily used:

args = [flowAccumFile, streamsFile, "1000.0", "NoData"]

for the last runPlugin and saved myself declaring the two variables.
Because the file names are generally used in subsequent operations,
it is a good idea to dedicate variables to those parameters.
'''
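For what it's worth, here is one way the same workflow might be condensed: the four tools are listed as (tool name, arguments) pairs and run in a loop, suppressing the returned data for every step except the last, as described in the script's comments. This is only a sketch. The StandInHost class and the run_workflow function are hypothetical names that I've introduced so that the example can run outside of Whitebox; the stand-in simply records each call rather than running a tool. Inside the Scripter you would pass the real pluginHost instead.

```python
class StandInHost(object):
    """Hypothetical stand-in for Whitebox's pluginHost. It records each
    runPlugin call instead of running a tool, so that this sketch can
    be executed outside of the Whitebox Scripter."""
    def __init__(self):
        self.calls = []

    def runPlugin(self, name, args, separateThread=False, suppressOutput=False):
        self.calls.append((name, list(args), separateThread, suppressOutput))

def run_workflow(host, wd):
    # File names are kept in variables because later steps reuse them.
    dem = wd + "Vermont DEM.dep"
    filled = wd + "filled DEM.dep"
    pointer = wd + "pointer.dep"
    accum = wd + "flow accumulation.dep"
    streams = wd + "streams.dep"

    # Each step is a (tool name, string arguments) pair.
    steps = [
        ("FillDepressions", [dem, filled, "0.001"]),
        ("FlowPointerD8", [filled, pointer]),
        ("FlowAccumD8", [pointer, accum, "number of upslope grid cells", "False"]),
        ("ExtractStreams", [accum, streams, "1000.0", "NoData"]),
    ]
    for i, (name, args) in enumerate(steps):
        is_last = (i == len(steps) - 1)
        # Run on the calling thread; only the final result is displayed.
        host.runPlugin(name, args, False, not is_last)
    return streams

host = StandInHost()
result = run_workflow(host, "/data/")  # "/data/" is a made-up directory
```

Listing the steps as data makes it easy to add or reorder tools, at the cost of the step-by-step commentary that the longer script provides.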
Whitebox Scripter

Catching Bugs Before They Bug You:

When you’re writing a script in Whitebox and your program throws an error, the software will record the error in your log files. The log files are XML files contained within the logs directory within the main Whitebox folder. They are detailed printouts of exactly what was happening around the time that the exception was thrown, and they can certainly be challenging to read. An easier way of dealing with this problem is to incorporate exception handling in your script. Here’s a brief example:

try:
	i = 14
	k = "24"
	j = i + k # Throws an error
except Exception, e:
	print e
	pluginHost.showFeedback("Error during script execution.")
	''' alternatively, you may want to send the exception to 
	the pluginHost.logException() method '''
finally:
	print "I'm done!"

This code will print the following statement to the Whitebox Scripter console:

TypeError("unsupported operand type(s) for +: 'int' and 'unicode'",)
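The same idea can be applied when running a tool over many files, so that one bad input doesn't halt the whole batch. The sketch below is illustrative only: StandInHost and run_safely are hypothetical names, and the stand-in's runPlugin raises an error for any input that isn't a .dep file purely so that the example has something to catch. It is an assumption that a failing Whitebox tool surfaces as an exception your script can catch in this way; in Jython, errors raised on the Java side may need to be caught separately.

```python
class StandInHost(object):
    """Hypothetical stand-in for pluginHost so that this sketch runs
    outside of Whitebox. Its runPlugin fails on non-.dep inputs."""
    def __init__(self):
        self.messages = []

    def runPlugin(self, name, args, separateThread=False):
        if not args[0].endswith(".dep"):
            raise ValueError("not a Whitebox raster: " + args[0])

    def showFeedback(self, message):
        self.messages.append(message)

def run_safely(host, tool, args):
    """Run one tool; report any error and return False instead of
    letting the exception stop the script."""
    try:
        host.runPlugin(tool, args, False)
        return True
    except Exception as e:
        host.showFeedback("Error running " + tool + ": " + str(e))
        return False

host = StandInHost()
inputs = ["dem1.dep", "notes.txt", "dem2.dep"]  # made-up file names
failed = []
for f in inputs:
    args = [f, "out_" + f, "3", "3", "false", "true"]
    if not run_safely(host, "FilterMean", args):
        failed.append(f)
```

After the loop, failed holds the inputs that could not be processed, which is usually easier to act on than digging through the log files.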