Selecting features by attributes in Whitebox GAT

The latest released version of Whitebox GAT (3.1.3) has greatly improved feature selection capabilities, including feature selection by attributes built directly into the vector attributes table. The following post is a brief tutorial to guide you through the process of how this works.

To select features based on attributes contained within the attributes table, first be sure that the vector layer of interest is selected as the active map layer in the Layers tab. Now open the attribute table of the vector layer, either by selecting the View Attribute Table icon located on the toolbar or by selecting View Attribute Table from the Data Layers menu. Click on the Feature Selection tab (see below).

Figure: Example of selection by attribute

The first important feature of this tab pane is the Execute Code icon (green arrow) used to run a selection command. You will also notice a drop-down menu that can be used to specify the type of selection operation including the following four modes:

  • Create new selection
  • Add to current selection
  • Remove from current selection
  • Select from current selection

The modes differ in how the newly identified features are combined with any features that are already selected. The default mode, Create new selection, simply ignores any existing selection and creates a new selection set.
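One way to picture the four modes is as set operations between the features already selected and the features matched by the new script. The sketch below is plain Python rather than anything from Whitebox itself: the `apply_selection` function is a made-up name, and I'm assuming that Select from current selection keeps only those currently selected features that also match the new script.

```python
def apply_selection(mode, current, matched):
    """Combine the current selection with newly matched feature indices.

    Hypothetical helper for illustration only; Whitebox handles this internally.
    """
    current, matched = set(current), set(matched)
    if mode == "Create new selection":
        return matched                # ignore whatever was selected before
    if mode == "Add to current selection":
        return current | matched      # union
    if mode == "Remove from current selection":
        return current - matched      # difference
    if mode == "Select from current selection":
        return current & matched      # intersection (assumed meaning)
    raise ValueError("Unknown selection mode: %s" % mode)


current = [1, 2, 3, 4]   # features selected before running the script
matched = [3, 4, 5]      # features for which the new script is true
for mode in ("Create new selection", "Add to current selection",
             "Remove from current selection", "Select from current selection"):
    print("%s -> %s" % (mode, sorted(apply_selection(mode, current, matched))))
```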

There is also a drop-down menu containing the names of each of the fields within the attribute table. If you select a field from the field name menu, the name will automatically be inserted into the selection script text area. The selection script text area also has an auto-complete feature: after typing one or a few letters of a field name, press Control (Ctrl) and the space bar (or Command + space on a Mac) and a helpful pop-up window will appear from which you can select the correct field name. This auto-complete feature is also available to help identify all of the methods that are associated with fields containing strings (e.g. startsWith, toLowerCase, equals, contains, etc.). If the selected field is a text string or is a numeric integer value with fewer than 300 unique values within the table, these values will be listed in the ‘Unique Values’ drop-down menu. Selecting a unique value will insert it into the selection script text area as well.

The operators drop-down menu provides a quick-link list to many of the commonly used operators. Importantly, the selection is based on Groovy-language scripting. Groovy is a superset of the Java programming language and provides substantial power to the feature selection process. The common logical and comparison operators that are used in a selection script include:

  • & (logical AND operator)
  • | (logical OR operator)
  • == (equal-to comparison operator for numeric data types; for text strings use the str1.equals(str2) method)
  • != (not-equal-to comparison operator for numeric data types; for text strings use !str1.equals(str2))
  • > (greater-than comparison operator)
  • >= (greater-than-or-equal-to comparison operator)
  • < (less-than comparison operator)
  • <= (less-than-or-equal-to comparison operator)

However, any valid Groovy code is allowable in the selection script. The selection script is typically a single line long, effectively a single command, although multiline scripts are accepted as well. Importantly, the last line of the script must evaluate to a Boolean (true or false) value. Each row in the attributes table is evaluated, and if the final-line expression evaluates to ‘true’, the corresponding feature is selected. The values within each of the fields for a row are assigned to variables with the same name as the field (case sensitive). There is also an extra variable available within the script called ‘index’, which is the row (feature) number. For example, the following selection script:

index < 100 

would select the first 100 features in the vector layer. More complex selections are also possible. For example,

 
NAME.startsWith("C") & POP2005 > 25000000 & LAT > 0.0

selects the features that have NAME starting with the letter C, POP2005 greater than 25,000,000 and LAT greater than 0.0. And,

CLASS.equals("agriculture") | CLASS.equals("forest")

selects all features that have an attribute CLASS value of either agriculture or forest.
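To make the evaluation model concrete, here is a rough Python stand-in for what happens to each row. It is not Whitebox's implementation, and real selection scripts are written in Groovy rather than Python. The `select` function, the sample rows, and their values are all invented for illustration, and the sketch counts `index` from zero, which is an assumption on my part.

```python
def select(rows, predicate):
    """Return the indices of the rows for which the predicate is true."""
    selected = []
    for index, row in enumerate(rows):
        # Field values are exposed under their (case-sensitive) field names,
        # alongside the extra 'index' variable.
        variables = dict(row, index=index)
        if predicate(variables):
            selected.append(index)
    return selected


# Invented sample table; names and numbers are not real data.
rows = [
    {"NAME": "Cedar", "POP2005": 30000000, "LAT": 45.0},
    {"NAME": "Cobalt", "POP2005": 10000000, "LAT": 50.0},
    {"NAME": "Coral", "POP2005": 40000000, "LAT": -20.0},
    {"NAME": "Birch", "POP2005": 90000000, "LAT": 10.0},
]

# Python counterpart of: NAME.startsWith("C") & POP2005 > 25000000 & LAT > 0.0
print(select(rows, lambda v: v["NAME"].startswith("C")
             and v["POP2005"] > 25000000 and v["LAT"] > 0.0))  # [0]

# Python counterpart of: index < 2
print(select(rows, lambda v: v["index"] < 2))  # [0, 1]
```

Note that Python needs `and` here: its `&` operator binds more tightly than the comparisons, whereas in Groovy and Java the comparisons are evaluated first.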

After performing a selection, you will find that a filter has been applied to the attribute table such that only the selected features are displayed. The Options menu allows you to remove this filter, to copy the selected features to the clipboard (they can then be pasted into a spreadsheet program as comma-separated values (CSV) text), or even to save the features as a separate vector layer. The selected features will also be rendered on the map with a cyan coloured outline. That’s it. I think it’s rather straightforward and hopefully you do too. As always, best wishes and happy geoprocessing.


Workflow Automation in Whitebox GAT

(This is part 1 of a multi-part post. See Workflow Automation Part 2 here.)

Okay, Whitebox GAT provides users with a very user-friendly means of accessing geoprocessing tools through dialog boxes with built-in help documentation. The dialog boxes are very consistent in design and the tools are each readily accessible from the Tools tab in the tool tree-view and the quick-search lists. This likely provides the most convenient way to access this functionality for most of your day-to-day work. Every now and then, however, you’ll find yourself in a situation where you have a workflow with numerous intermediate steps and you need to repeat those steps several times over. This is a case for automation. Every plugin tool in Whitebox can be run automatically from a script. In fact, the help documentation for each tool provides a description of how that tool can be scripted. In the longer term I intend to develop a graphical workflow automator that will automatically generate scripts based on a workflow graph that the user creates. This type of functionality, similar to what is found in ArcGIS, Idrisi, and a few others, can be quite helpful for automating certain tasks. Graphical automators are never as powerful as directly scripting a workflow, however, and that is why so many GIS users these days have become comfortable with scripting. It’s an extremely powerful tool for the GIS analyst to have.

Whitebox GAT currently provides scripting support for three programming languages: Python (actually Jython), Groovy, and JavaScript. I know that because ArcGIS supports Python scripting there are many GISers out there who are familiar with Python. Those of you who have been paying close attention to Whitebox development may have noticed that I have started to write a lot of new plugin tools using Groovy (most of Whitebox is developed using Java). Why Groovy instead of Python, you may ask? Well, Groovy is a superset of Java, meaning that most valid Java is valid Groovy (if you know Java you already know Groovy). Groovy, being a scripting language, has some syntax advantages that make it much terser than regular Java. Compared with Jython, the Groovy project also seems to be more actively maintained and certainly has better computational performance, particularly with the optional static compilation. So for writing full-blown plugin tools, Groovy makes a lot of sense. (As an aside, how can you not like a language with such a great name?) But for simply automating workflows, I know that most of you are going to be looking at good old Python.

So, that said, below is a Python script that provides an example of how to automate a workflow. In this case, it automates the very common task in spatial hydrology of taking in a digital elevation model, removing the topographic depressions, calculating the flow directions, calculating the flow accumulation grid, then extracting streams. The script could be made much terser but I added a lot of comments to clarify. Simply paste the script into the Whitebox Scripter and press Execute Code. Scripting in Whitebox GAT can be a heck of a lot of fun and awfully addictive once you realize how much power you have at your fingertips! As always, best wishes and happy geoprocessing.

'''
Copyright (C) 2014 Dr. John Lindsay <jlindsay@uoguelph.ca>

This program is intended for instructional purposes only. The
following is an example of how to use Whitebox's scripting
capabilities to automate a geoprocessing workflow. The scripting
language is Python; more specifically it is Jython, the Python
implementation targeting the Java Virtual Machine (JVM).

In this script, we will take a digital elevation model (DEM),
remove all the topographic depressions from it (i.e. hydrologically
correct the DEM), calculate a flow direction pointer grid, use
the pointer file to perform a flow accumulation (i.e. upslope area)
calculation, then threshold the upslope area to derive valley lines
or streams. This is a fairly common workflow in spatial hydrology.

When you run a script from within Whitebox, a reference to the
Whitebox user interface (UI) will be automatically bound to your
script. Its variable name is 'pluginHost'. This is the primary
reason why the script must be run from within Whitebox's Scripter.

First we need the directory containing the data, and to set
the working directory to this. We will use the Vermont DEM contained
within the samples directory.
'''

import os

separator = os.sep # The system-specific directory separator
wd = pluginHost.getApplicationDirectory() + separator + "resources" + separator + "samples" + separator + "Vermont DEM" + separator
pluginHost.setWorkingDirectory(wd)

demFile = wd + "Vermont DEM.dep"
# Notice that spaces are allowed in file names. There is also no
# restriction on the length of the file name...in fact longer,
# descriptive names are preferred. Whitebox is friendly!

# A raster or vector file can be displayed by specifying the file
# name as an argument of the returnData method of the pluginHost
pluginHost.returnData(demFile)

'''
Remove the depressions in the DEM using the 'FillDepressions' tool.

The help file for each tool in Whitebox contains a section detailing
the required input parameters needed to run the tool from a script.
These parameters are always fed to the tool in a String array, in
the case below, called 'args'. The tool is then run using the 'runPlugin'
method of the pluginHost. runPlugin takes the name of the tool (see
the tool's help for the proper name), the arguments string array,
followed by two Boolean arguments. The first of these Boolean
arguments determines whether the plugin will be run on its own
separate thread. In most scripting applications, this should be set
to 'False' because the results of this tool are needed as inputs to
subsequent tools. The second Boolean argument specifies whether the
data that are returned to the pluginHost after the tool is completed
should be suppressed. Many tools will automatically display images
or shapefiles or some text report when they've completed. It is often
the case in a workflow that you only want the final result to be
displayed, in which case all of the runPlugins should have this final
Boolean parameter set to 'True' except for the last operation, for
which it should be set to 'False' (i.e. don't suppress the output).
The data will still be written to disk if the output is suppressed,
they simply won't be automatically displayed when the tool has
completed. If you don't specify this last Boolean parameter, the
output will be treated as normal.
'''
filledDEMFile = wd + "filled DEM.dep"
flatIncrement = "0.001" # Notice that although this is a numeric parameter, it is provided to the tool as a string.
args = [demFile, filledDEMFile, flatIncrement]
pluginHost.runPlugin("FillDepressions", args, False, True)

# Calculate the D8 pointer (flow direction) file.
pointerFile = wd + "pointer.dep"
args = [filledDEMFile, pointerFile]
pluginHost.runPlugin("FlowPointerD8", args, False, True)

# Perform the flow accumulation operation.
flowAccumFile = wd + "flow accumulation.dep"
outputType = "number of upslope grid cells"
logTransformOutput = "False"
args = [pointerFile, flowAccumFile, outputType, logTransformOutput]
pluginHost.runPlugin("FlowAccumD8", args, False, True)

# Extract the streams
streamsFile = wd + "streams.dep"
channelThreshold = "1000.0"
backValue = "NoData"
args = [flowAccumFile, streamsFile, channelThreshold, backValue]
pluginHost.runPlugin("ExtractStreams", args, False, False) # This final result will be displayed

'''
Note that in each of the examples above, I have created new variables
to hold each of the input parameters for the plugin tools. I've done
this more for clarity than anything else. The script could be
substantially shortened if the shorter values were entered directly
into the args array. For instance, I could have easily used:

args = [flowAccumFile, streamsFile, "1000.0", "NoData"]

for the last runPlugin and saved myself declaring the two variables.
Because the file names are generally used in subsequent operations,
it is a good idea to dedicate variables to those parameters.
'''
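
The repeated runPlugin calls above all follow one pattern: run each step on the calling thread, and suppress the returned data for every step except the last. A minimal sketch of that pattern as a loop follows. The `run_workflow` helper and the `FakeHost` class are my own illustrative names; `FakeHost` only records the calls so that the sketch runs outside Whitebox, and in the Scripter you would pass the real `pluginHost` instead. The working directory is a placeholder.

```python
class FakeHost(object):
    """Stand-in for pluginHost that records calls instead of running tools."""

    def __init__(self):
        self.calls = []

    def runPlugin(self, name, args, run_on_own_thread, suppress_output):
        self.calls.append((name, list(args), run_on_own_thread, suppress_output))


def run_workflow(host, steps):
    """Run (tool_name, args) steps in order, displaying only the last result."""
    for i, (name, args) in enumerate(steps):
        is_last = (i == len(steps) - 1)
        # False: do not run on a separate thread, so each step finishes
        # before the next one needs its output.
        # Suppress the returned data for every step except the final one.
        host.runPlugin(name, args, False, not is_last)


wd = "/path/to/data/"  # placeholder working directory
steps = [
    ("FillDepressions", [wd + "Vermont DEM.dep", wd + "filled DEM.dep", "0.001"]),
    ("FlowPointerD8", [wd + "filled DEM.dep", wd + "pointer.dep"]),
    ("FlowAccumD8", [wd + "pointer.dep", wd + "flow accumulation.dep",
                     "number of upslope grid cells", "False"]),
    ("ExtractStreams", [wd + "flow accumulation.dep", wd + "streams.dep",
                        "1000.0", "NoData"]),
]

host = FakeHost()
run_workflow(host, steps)
for name, args, threaded, suppressed in host.calls:
    print("%s suppressed=%s" % (name, suppressed))
```

Keeping the steps in a list makes it easy to add or reorder tools without touching the Boolean arguments by hand.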

Figure: Whitebox Scripter

Catching Bugs Before They Bug You:

When you’re writing a script in Whitebox, if your program throws an error, the software will record the error in your log files. The log files are XML files contained within the logs directory within the main Whitebox folder. They are detailed printouts of exactly what was happening around the time that the exception was thrown, and they can certainly be challenging to read. An easier way of dealing with this problem is to incorporate exception handling in your script. Here’s a brief example:

try:
	i = 14
	k = "24"
	j = i + k # Throws an error
except Exception, e:
	print e
	pluginHost.showFeedback("Error during script execution.")
	''' alternatively, you may want to send the exception to 
	the pluginHost.logException() method '''
finally:
	print "I'm done!"

This code will print the following statement to the Whitebox Scripter console, followed by the I'm done! message from the finally block:

TypeError("unsupported operand type(s) for +: 'int' and 'unicode'",)
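The same idea can be wrapped around each step of a longer script so that a failure tells you which step went wrong. The sketch below is only an illustration: `run_step` and the `report` callback are hypothetical names, and inside Whitebox you might pass `pluginHost.showFeedback` as the callback. It reads the exception from `sys.exc_info()` so that the same code runs under the older Python 2 syntax used by Jython as well as under Python 3.

```python
import sys


def run_step(description, func, report):
    """Call func(); on failure, report the error and return False."""
    try:
        func()
        return True
    except Exception:
        error = sys.exc_info()[1]  # the exception instance, version-neutral
        report("Error during '%s': %s" % (description, error))
        return False


def bad_addition():
    i = 14
    k = "24"
    return i + k  # raises TypeError: an int and a string cannot be added


def good_addition():
    i = 14
    k = "24"
    return i + int(k)  # convert the string to a number first


messages = []  # collects reports; a stand-in for a feedback dialog
print(run_step("bad addition", bad_addition, messages.append))    # False
print(run_step("good addition", good_addition, messages.append))  # True
print(len(messages))  # 1
```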

Getting the most out of Whitebox GAT

Have you been running into out-of-memory errors with Whitebox and are a bit confused because you have 8-16GB of memory in your computer? My guess is you’re running Microsoft Windows, and the good news is there’s a solution to your problems. The default Java Runtime Environment (JRE) for download on the Oracle site for computers running Microsoft Windows is the 32-bit version, even if you are running a 64-bit version of the operating system. This will severely limit the amount of memory that Whitebox can access. In fact, working with 32-bit Java will limit the memory that any Java program on your computer will see, but because Whitebox is used to process large spatial files, it’s particularly problematic for Whitebox users. The 32-bit version only allows Java to see about 1.1GB of your RAM, which doesn’t give you much room to play with that wonderful satellite image or LiDAR file, does it? This can affect performance and can lead to out-of-memory errors when handling large files. Given the large number of Windows users, this affects many people (I’ll point out here that there is only a 64-bit version of Java for Mac OS X, so this same problem simply doesn’t occur).

You can check what version of the JRE Whitebox is running under by selecting Options and Settings under the View menu.

Figure: Options and Settings in Whitebox

You can actually be running multiple versions of the Java runtime on your computer at the same time, so you really have to check this in Whitebox to find out which version it is running on. If you are running the 32-bit version on a 64-bit computer, it would be advisable to upgrade your Java to the 64-bit version. On the Oracle Java download site, the default version of Java for Windows is 32-bit, so you’ll need to explicitly select the 64-bit version. You want the version that ends with -windows-x64.exe rather than the 32-bit version, which ends with -windows-i586.exe. Now uninstall that old 32-bit version (make sure that no other programs require it first, and also make sure that Whitebox is closed before you uninstall Java) and install the new 64-bit version. Launch Whitebox again and check under the Options and Settings menu to see that you have a reasonable heap size and that you’re using 64-bit Java. You should find that you have considerably more room to play around with those massive files now. It’s like moving into a mansion after having lived for years in a one-bedroom apartment…what will you do with all that new room? I hope you use it to do some wonderfully exciting geospatial analysis 😉
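If you would rather check from a script, something like the following sketch could be pasted into the Whitebox Scripter. It relies on Jython's ability to import Java classes. The `sun.arch.data.model` system property is set by Oracle/HotSpot JVMs and may be missing on others, and `describe_jvm` is a helper name I've made up for illustration. Outside Jython the Java import fails, so the script only demonstrates the formatting with made-up numbers.

```python
try:
    from java.lang import Runtime, System  # only importable under Jython
except ImportError:
    Runtime = System = None


def describe_jvm(bits, max_heap_bytes):
    """Format the JVM word size and maximum heap as a readable string."""
    heap_gb = max_heap_bytes / (1024.0 ** 3)
    return "%s-bit JVM, max heap %.2f GB" % (bits, heap_gb)


if System is not None:
    # Running inside a JVM: query the real values.
    bits = System.getProperty("sun.arch.data.model")
    print(describe_jvm(bits, Runtime.getRuntime().maxMemory()))
else:
    # Not running under Jython: show the formatting with made-up numbers.
    print(describe_jvm("64", 4 * 1024 ** 3))
```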

Cheers,

John Lindsay