Whitebox GAT for ArcGIS users

A short time ago, over on the Whitebox GAT listserv, someone asked an excellent question: is there such a thing as a Whitebox GAT User’s Manual? The transition to Whitebox GAT, when you’re coming from an ArcGIS background, can be a little difficult. There are often tools in Whitebox GAT that correspond to those in ArcGIS, but they frequently (and sometimes frustratingly, I’d imagine) have different names. There are different workflows for doing similar tasks, and in general you can feel a little like a fish out of water until you get used to the Whitebox GAT way of doing things.

I can certainly understand this perspective. To date, I’ve been the main contributor to the Whitebox GAT project, and so it reflects my design decisions more than anything. And, let’s face it, I’m a little different. As I pointed out to the person who asked, I have never used ArcGIS on a day-to-day basis. I did use IDRISI extensively many years ago (and that fact is apparent in the way certain things are done in Whitebox), but frankly I’m just not that familiar with ArcGIS, and it has not informed how I have developed Whitebox to any large extent.

While we’re still working on a draft of a comprehensive Whitebox GAT User’s Manual (thank you to Adam Bonnycastle for his efforts here), I have been busily compiling quick tutorials on common tasks in Whitebox, which I have been cleverly embedding in the Help menu:

Whitebox's Help menu

Notice that list starting with ‘Getting the Whitebox source code’? I’m not sure to what extent this feature has been taken up, but it’s there. If you have any suggestions for other additions, let me know either by emailing me or by leaving a comment below. If you have created any Whitebox GAT tutorials of your own and would like to donate them to the project, please also let me know.

But this gets me to the main point of this blog post. What I would like to do is offer a brief tutorial on the topic “Whitebox GAT for ArcGIS users”. It can be loosely formed as a list of common gotchas, or ‘Them vs. Us’ comparisons. But as I’ve said above, I really don’t use ArcGIS, so I need to ask for your help. If you’re a Whitebox GAT user who has come to it from an ArcGIS background, can you provide us with a list of the main difficulties you experienced when you first tried using Whitebox? We’re looking for the things that you had to spend time figuring out how to do. Wouldn’t it be great if you could help to make that transition a little easier for future Whitebox users? I’m a big believer in the idea of paying it forward. Leave your comments below and, as always, best wishes and happy geoprocessing.

Whitebox Geospatial Analysis Tools v. 3.2.1 Now Released!

I am very pleased to announce the release of version 3.2.1 of the free and open-source GIS Whitebox Geospatial Analysis Tools. Although I was not able to accomplish all of the development tasks that I had set out to do over the summer, this version does incorporate numerous improvements and additions (see below for details). Whitebox 3.2.1 requires Java 8u20 or higher, which can be downloaded from the Oracle site. Please update your Java runtime before launching Whitebox. I hope that you enjoy the new version. As always, I welcome any feedback you may have, including suggestions for future features and tools and comments on general usability. Bugs and other errors can be submitted using the ‘Report An Error’ tool found within Whitebox’s Help menu. For questions regarding specific applications, please consider subscribing to the Whitebox Listserv, which can be found at the following site:

http://www.lsoft.com/scripts/wl.exe?SL1=WHITEBOX-GAT&H=LISTSERV.UOGUELPH.CA

The following modifications have been made:

  • Added the Retrieve SRTM DEM Data tool. This tool will download SRTM DEM tiles from the USGS FTP site, import the tiles into Whitebox GAT, and optionally fill missing data holes and mosaic the tiles.
  • Added the ability to natively display LAS file point clouds in the map area.
  • Added Hex-binning tool for producing hexagonally gridded heat-maps (this is pretty cool for data visualization!).
  • Added a tool for isolating ground return points from LiDAR LAS files. I plan on adding additional methods for performing this type of LiDAR point filtering in the near future.
  • Added a tool for separating large LiDAR LAS files into smaller tiles of shapefile points.
  • Added a Lee filter (Sigma filter) tool.
  • Added Interior Point tool.
  • Added CreateRectangularGridTool for creating a vector grid of polygons based on
    a rectangular grid.
  • Added Single-part to Multi-part and Multi-part to Single-part tools.
  • Added List Unique Values tool to list all of the unique values in a categorical
    field contained within a vector’s attribute table. It will also report category
    frequency.
  • Added a fast depression breaching tool. Depression breaching is a far superior
    method for pre-processing a DEM for hydrological analysis (e.g. mapping
    watersheds). One of the reasons that people continue to fill depressions
    instead of breaching them is that filling is computationally much more
    efficient. Well, this tool is just about as fast as Whitebox’s depression
    filling tool, so now you have no excuse: this should be your default tool for
    hydrological processing of DEMs. Its result is not as good as that of the
    optimal breaching tool, but it is far more efficient and still much better
    than filling. Please breach!
  • Added the tool BurnStreamAtRoads, which will carve a stream path through road embankments at the site of road/stream crossings.
  • Added a Minimum Interpolation tool for shapefile point inputs similar to the
    Minimum Interpolation (LiDAR) tool.
  • Added the ability to update individual scripts from the source code repository.
    This way, if you modify the code for a tool and break it, you can always
    revert it back to the source on the main trunk. There’s also now a menu entry
    that will update all scripts for which there are newer versions within the
    code repository and new scripts.
  • Added a Flood Order tool; details in tool’s help.
  • Added a polygonize tool for converting polylines into polygons.
  • Updated the centroid (vector) tool with the option to handle multi-part polyline
    and polygon features as a single entity or to extract centroids for each part,
    as well as the ability to extract centroids for groups of points.
  • Created a convenience tool, Feature Selection, for opening the feature selection
    tab within the attribute table dialog after selecting a vector layer. If the
    layer is not currently displayed, it will be displayed. I added this tool
    because many people search for feature selection in the toolbox rather than
    in the attribute table dialog.
  • The Hillshade tool will now automatically calculate an appropriate z conversion value when it detects that the DEM is in geographic coordinates with XY units of degrees. The Slope, Aspect, and Curvature tools now do this as well.

Workflow Automation Part 2

In my earlier post on workflow automation in Whitebox, Simon Seibert left a comment asking, “I would like to know if it possible to include for loops in the Scripter as well? I would like to run the same script over many files. Could you also provide an example for such a problem?” Well, Simon, yes you can. Here is an example of a simple Python script that finds all of the raster files contained within the current working directory and then performs a very simple analysis on each file:

import os
# The following code will find each raster
# file (.dep) in the working directory and
# then run a mean filter on the image.
wd = pluginHost.getWorkingDirectory()
a = 1
for file in os.listdir(wd):
  if file.endswith(".dep"):
    inputFile = wd + file
    outputFile = wd + "output" + str(a) + ".dep"
    a += 1
    xDim = "3"
    yDim = "3"
    rounded = "false"
    reflectEdges = "true"
    args = [inputFile, outputFile, xDim, yDim, rounded, reflectEdges]
    pluginHost.runPlugin("FilterMean", args, False)

If you want to take it to the next level, you can parallelize the script so that each iteration is run on a separate thread. So, there you have it. Leave your comments below and, as always, best wishes and happy geoprocessing.
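
As a postscript, here is one way that parallel variant could be sketched. This is illustrative only: the `run_mean_filter` function below is a hypothetical stand-in for the `pluginHost.runPlugin` call in the script above (so the example is self-contained), it is written in modern Python rather than the Scripter's Jython, and whether plugin runs can safely execute concurrently depends on the plugin host.

```python
import threading

def run_mean_filter(input_file, output_file):
    # Stand-in for pluginHost.runPlugin("FilterMean", args, False);
    # in the Whitebox Scripter you would call the plugin here instead.
    return (input_file, output_file)

def worker(file_pairs, results, lock):
    # Each thread processes its own share of the files.
    for input_file, output_file in file_pairs:
        result = run_mean_filter(input_file, output_file)
        with lock:
            results.append(result)

def process_in_parallel(files, num_threads=4):
    # Pair each input with a numbered output name, then divide the
    # list among the threads round-robin style.
    pairs = [(f, "output%d.dep" % i) for i, f in enumerate(files, start=1)]
    chunks = [pairs[i::num_threads] for i in range(num_threads)]
    results, lock = [], threading.Lock()
    threads = [threading.Thread(target=worker, args=(c, results, lock))
               for c in chunks if c]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

files = ["dem1.dep", "dem2.dep", "dem3.dep", "dem4.dep"]
print(process_in_parallel(files))
```

The lock around the shared results list is the important detail; each `runPlugin` call would then occupy its own thread while the others keep working.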

Hexbinning in Whitebox GAT

The practice of binning point data to form a type of 2D histogram, density plot, or what is sometimes called a heatmap, is quite useful as an alternative for the cartographic display of very dense point sets. This is particularly the case when the points overlap significantly at the displayed scale. This type of operation is used (in feature space) in Whitebox’s Feature Space Plot tool. Normally, however, binning is performed using a regular square or perhaps rectangular grid. The other day I read an interesting blog post on hexbinning (see also the excellent post by Zachary Forest Johnson on the same topic), or the use of a hexagonal grid for the creation of density plots.

A Hex-binned heatmap

These blogs got me excited, in part because I have used hex-grids for various spatial operations in the past and am aware of several advantages of this tessellation structure. The linked hexbinning blogs point out that hexagonal grids are useful because the hexagon is the most circular polygon that tessellates. A consequence of this circularity is that hex grids tend to represent curves more naturally than square grids, which includes better representation of polygon boundaries during rasterization. But there are several other advantages that make hexbinning a worthwhile practice. For example, one consequence of the nearly circular shape of hexagons is that hex-grids are very closely packed compared with square grids. That is, the average spacing between hexagon centre points in a hex-grid is smaller than the equivalent average spacing in a square grid. One way of thinking about this characteristic is that hexagonal grids require about 13% fewer data points than square grids to represent a distribution at a comparable level of detail. Also, unlike a square grid, each cell in a hex-grid shares an equal-length boundary with each of its six neighbouring grid cells. With square grids, four of the eight neighbouring cells are connected through a point only. This causes all sorts of problems for spatial analysis, not the least of which is the characteristic orientation sensitivity of square grids; and don’t get me started on the effects of this characteristic on surface flow-path modelling on raster digital elevation models. Hex-grid cells are also equidistant from each of their six neighbours. I’ve even heard it argued that, given the shape of the cone and rod cells in our eyes, hex-grids are more naturally suited to the human visual system, although I’m not sure how true this is.
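
Two of these claims are easy to check numerically. The little script below is an illustrative check, not Whitebox code: it builds pointy-top hexagon centres from the usual offset-row layout and confirms that all six neighbouring centres are equidistant. It also computes the roughly 13% saving, a figure usually attributed to the √3/2 density ratio of hexagonal versus square sampling.

```python
import math

def hex_centre(col, row, R):
    # Pointy-top hexagon with circumradius R, laid out in offset rows:
    # horizontal spacing is sqrt(3)*R, vertical spacing is 1.5*R, and
    # every other row is shifted right by half the horizontal spacing.
    dx, dy = math.sqrt(3.0) * R, 1.5 * R
    x = col * dx + (dx / 2.0 if row % 2 else 0.0)
    return (x, row * dy)

def neighbour_distances(R=1.0):
    # Distances from the centre at (col=2, row=2) to its six neighbours.
    cx, cy = hex_centre(2, 2, R)
    # Row 2 is an even row, so its diagonal neighbours sit at cols 1 and 2
    # of the odd (shifted) rows above and below.
    neighbours = [(1, 2), (3, 2), (1, 1), (2, 1), (1, 3), (2, 3)]
    dists = []
    for col, row in neighbours:
        nx, ny = hex_centre(col, row, R)
        dists.append(math.hypot(nx - cx, ny - cy))
    return dists

# All six distances come out to sqrt(3)*R: the neighbours are equidistant.
print(neighbour_distances())

# Hexagonal sampling needs sqrt(3)/2 ~ 0.866 times as many samples as
# square sampling for comparable detail, i.e. roughly 13% fewer points.
print(1.0 - math.sqrt(3.0) / 2.0)
```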

Hexagonal grids are certainly worthwhile data structures, and hex-binning is a great way to make a heatmap. So, that said, I decided to write a tool to perform hex-binning in Whitebox GAT. The tool will be publicly available in the 3.2.1 release of the software, which will hopefully be out soon, but here is a preview:

Whitebox's Hexbinning tool

The tool will take either an input shapefile of POINT or MULTIPOINT ShapeType, or a LiDAR LAS file and it will be housed in both the Vector Tools and LiDAR Tools toolboxes. Here is an example of a hex-binned density map (seen on right) derived using the tool applied to a distribution of 7.7 million points (seen on left) contained in a LAS file and derived using a terrestrial LiDAR system:

Hexbinned LiDAR points

Notice that the points in the image on the left are so incredibly dense in places that you cannot effectively see individual features; they overlap completely to form blanket areas of points. It wouldn’t matter how small I rendered the points; at the scale of the map, they would always coalesce into areas. The hex-binned heatmap is a far more effective way of visualizing the distribution of LiDAR points in this case.

The hexbinning tool is also very efficient. It took about two seconds to perform the binning on the 7.7 million LiDAR points above using my laptop. In the CUNY blog linked above, the author (I think it was Lee Hachadoorian) describes several problems that they ran into when they performed the hexbinning operation on their 285,000 points using QGIS. I suspect that their problems were at least partly the result of the fact that they performed the binning using point-in-polygon operations to identify the hexagon in which each point plotted. Point-in-polygon operations are computationally expensive, and I think there is a better way here. A hex-grid is essentially the Voronoi diagram of the set of hexagon centre points, meaning that every location within a hexagonal grid cell is nearer to that hexagon’s centre than to the centre point of any neighbouring cell. As such, a more efficient way of performing hexbinning is to enter each hexagon’s centre point into a KD-tree structure and perform a highly efficient nearest-neighbour query for each data point to identify its enclosing hexagon. This is a fast operation in Whitebox, and so the hexbinning tool works well even with the massive point sets that you commonly encounter when working with LiDAR data and other types of spatial data.
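
To make the idea concrete, here is a minimal sketch of nearest-centre hexbinning. For brevity, the KD-tree is replaced by a brute-force nearest-centre search; the assignment logic, which relies on the Voronoi property described above, is the same, and only the speed of the lookup differs. This is illustrative code, not the Whitebox implementation.

```python
import math
from collections import defaultdict

def hex_centres(xmin, ymin, xmax, ymax, R):
    # Generate pointy-top hexagon centres covering the bounding box,
    # with every other row offset by half the horizontal spacing.
    dx, dy = math.sqrt(3.0) * R, 1.5 * R
    centres = []
    row = 0
    y = ymin
    while y <= ymax + dy:
        offset = dx / 2.0 if row % 2 else 0.0
        x = xmin + offset
        while x <= xmax + dx:
            centres.append((x, y))
            x += dx
        y += dy
        row += 1
    return centres

def hexbin(points, centres):
    # Assign each point to its nearest hexagon centre. Because the hex
    # grid is the Voronoi diagram of its centres, the nearest centre is
    # the enclosing hexagon; a KD-tree would make this query fast on
    # millions of LiDAR points, but brute force shows the same logic.
    counts = defaultdict(int)
    for px, py in points:
        nearest = min(centres, key=lambda c: (c[0] - px) ** 2 + (c[1] - py) ** 2)
        counts[nearest] += 1
    return counts

centres = hex_centres(0, 0, 10, 10, 1.0)
points = [(1.0, 1.0), (1.1, 0.9), (8.0, 8.0)]
counts = hexbin(points, centres)
print(max(counts.values()))  # the two nearby points share one hexagon
```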

So I hope you have fun with this new tool and make some beautiful maps. Leave your comments below and, as always, best wishes and happy geoprocessing.

John Lindsay

Creating beautiful relief models

Recently I stumbled across a couple of very interesting questions over on the GIS Stack Exchange. One of them was about the use of appropriate colour palettes for displaying DEMs, and the other was about colour palettes for hillshade images. This last one in particular got me thinking a lot about effective cartographic display of terrain and introduced me to the work of the famous Swiss cartographer Eduard Imhof. Why is it that every hillshade image that I have ever created is displayed in a greyscale palette?

Greyscale hillshade of the Banff area.

I’ve always thought that hillshading is a very effective way of visualizing topographic information, but greyscale hillshade images by themselves are so boring! Of course, the answer is that I don’t normally display just a hillshade image. Instead, the hillshade image is displayed transparently over top of a DEM, with colour used to relay elevation information and tone used for illumination. This type of composite relief model is very effective for visualizing terrain. But what about using colours other than simple greyscale for the illumination, like Imhof did so long ago? To do this effectively in Whitebox, you may want to create a custom palette using the Palette Manager (under the Tools menu), one of the most under-used tools in Whitebox:

The Palette Manager tool in Whitebox GAT

So here is a hillshade image displayed using the custom palette above [simply blending RGB(0, 50, 100) to RGB(255, 240, 170)]:

Imhof-inspired hillshade
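
For the curious, that two-colour palette amounts to a simple linear interpolation between the two RGB endpoints given above. Here is an illustrative sketch of the blend (the endpoint values are the ones from this post; this is not the actual palette file format Whitebox uses):

```python
def blend(c1, c2, t):
    # Linearly interpolate each channel between the two endpoint colours,
    # where t runs from 0.0 (first colour) to 1.0 (second colour).
    return tuple(int(round(a + (b - a) * t)) for a, b in zip(c1, c2))

# The dark shadow tone and the warm highlight tone from the post.
dark, light = (0, 50, 100), (255, 240, 170)

# A 256-entry palette sweeping from dark to light.
palette = [blend(dark, light, i / 255.0) for i in range(256)]
print(palette[0], palette[255])  # (0, 50, 100) (255, 240, 170)
```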

I think that’s beautiful. It’s like bathing the landscape in warm, bright sunlight. Of course, we can actually take this a step further. My father is an artist and retired art teacher. I remember when I was growing up, he taught me that the atmosphere is actually coloured when you have enough air thickness; it’s a little blue. That’s why in this painting of his, one of my favourites, you can see that the hills in the distance become slightly bluer:

Notice the blue tinge for distant mountains (Powassan landscape by James M. Lindsay)

I now know that this phenomenon is called Rayleigh scattering (the reason that the sky is blue), and kudos to my Dad for intuitively understanding it. Since the height of the air column and the density of the atmosphere are greater in the valley bottoms, there should be a ‘blue tinge’ in the valleys. We can create this ‘blue tinge’ by transparently overlaying the DEM, displayed in a light-blue-to-white palette (again created using the Palette Manager), and get this:

Final hillshade image

You can play around with the palettes and the levels of transparency (and the display minimum and maximum values) until you get things just the way that you want. It’s almost artistic! I think we can all agree, however, that this final hillshade is much more visually pleasing and a more effective display of terrain than that initial greyscale image. It looks less like a photo of a harsh lunar landscape and more like an earthscape. So have some fun with it, and don’t always accept the default values of things. Terrain displayed in this way also lends itself well to overlaying other information:

Columbia Icefield map.

Leave your comments below and, as always, best wishes and happy geoprocessing.

John

SRTM DEM data downloader in Whitebox GAT

I just finished developing a tool for Whitebox GAT that will automatically download Shuttle Radar Topography Mission (SRTM) digital elevation models (DEMs) from the USGS SRTM FTP site. SRTM-3 data are among the best global elevation data, with a grid resolution of approximately 90 m. In many areas SRTM data provide the only topographic data set available. Within the United States, the SRTM-1 dataset provides an improved 30 m resolution. Not only does this Whitebox tool retrieve the SRTM tiles contained within the bounding box of a specified area of interest, but it will also import the tiles to Whitebox GAT, fill missing data holes (which are common with SRTM data in rugged terrain) and mosaic the tiles.
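
As an aside, working out which tiles intersect an area of interest is straightforward, because SRTM tiles are one-degree squares named by the latitude and longitude of their south-west corner (e.g. N51W003). The sketch below illustrates the idea; it is not the code the Whitebox tool uses, and the FTP paths themselves are omitted:

```python
import math

def srtm_tile_names(min_lon, min_lat, max_lon, max_lat):
    # SRTM tiles are one-degree squares named by the latitude and
    # longitude of their south-west corner, e.g. N51W003 or S02E029.
    names = []
    for lat in range(int(math.floor(min_lat)), int(math.floor(max_lat)) + 1):
        for lon in range(int(math.floor(min_lon)), int(math.floor(max_lon)) + 1):
            ns = "N" if lat >= 0 else "S"
            ew = "E" if lon >= 0 else "W"
            names.append("%s%02d%s%03d" % (ns, abs(lat), ew, abs(lon)))
    return names

# Tiles covering a small box over south-east England:
print(srtm_tile_names(-1.5, 51.2, 0.5, 52.1))
```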

Whitebox's new Retrieve SRTM Data tool

There have been many times in the past when I have needed to download numerous SRTM files, import the files, fill the missing data holes, and finally mosaic the individual tiles. It can be a laborious workflow indeed. This tool will save me a great deal of time, and so I’m rather excited about it. It’s as though the data magically appear in Whitebox!

SRTM data in Whitebox GAT

Of course, with Whitebox GAT’s extensive Terrain Analysis and Hydrological Analysis toolboxes, there are plenty of interesting things that you can do once those data do magically appear.

90 m DEM of the entire British Isles. (Note that the image is of a coarser resolution than the actual DEM.)

As an example, I created the SRTM-3 90 m DEM of the entire British Isles shown above in 4 minutes, 56.27 seconds, including the time to download 91 individual SRTM tiles, fill their missing data gaps, and mosaic the tiles. My son was even watching Netflix while I was downloading, so I can only imagine how much that slowed things down! I only wish that other data providers would follow a data-sharing model similar to that of the USGS and use an anonymous FTP server to distribute their data. If that were the case, we could have many other data sets ingested automatically into Whitebox. I’ll release this new SRTM retrieval tool in the next public release of Whitebox GAT (v 3.2.1), which will likely be sometime later this summer. If you are as keen to try it out as I am, email me for a preview copy. If you have any other comments or feedback, please leave them in the comments section below. As always, best wishes and happy geoprocessing.

John Lindsay

*****EDIT*****

Version 3.2.1 of Whitebox has now been released, with the SRTM downloader tool embedded in the Data Layers menu. Please let me know if you have any issues in using the tool.

*****EDIT*****

I’ve included an image of the DEM of Ireland for John below:

SRTM DEM of Ireland

So, what exactly is ‘open-access’ software?

There’s no doubt that Whitebox Geospatial Analysis Tools is open-source software. It was developed under a free and open-source software (FOSS) licence, the GNU General Public Licence, and its source code is publicly available and modifiable. But I often say that Whitebox GAT is an example of open-access software. So what exactly do I mean by the term open-access software? That’s a good question since, as far as I know, I made the term up. Actually, open access is a fairly common idea these days and has largely developed out of a perceived need for greater public access to the outputs of academic publishing. The concept of open access is defined in the statement of the Budapest Open Access Initiative (Chan et al., 2002) as the publication of scholarly literature in a format that removes financial, legal, and technical barriers to knowledge transfer. Although this definition focuses solely on the publication of research literature, I would argue that the stated goal of reducing the barriers associated with knowledge transfer can be applied equally to software. In fact, many of the goals of open access stated in the definition above are realized by open-source software. Therefore, open-access software can be viewed as a complementary extension to the traditional open-source model of software development.

The idea of open-access software is that the user community as a whole benefits from the ability of individual users to examine the internal workings of the software. In the case of geospatial software, e.g. a GIS or remote sensing package, this is likely to relate to the specific algorithms associated with various analysis tools. Câmara and Fonseca (2007) also recognized that adopting open-source software is not only a choice of software but also a means of acquiring knowledge. Direct insight into algorithm design and implementation allows for educational opportunities and a deeper level of knowledge transfer, as well as the potential for rapid innovation, software improvement, and community-directed development. I would argue that this is particularly important in the GIS field because many geospatial algorithms are highly complex and are impacted by implementation details. There are often multiple competing algorithms for accomplishing the same task, and the choice of one method over another can greatly impact the outcome of a spatial analysis operation. For example, consider how many different methods there are for measuring the pattern of slope gradient from a digital elevation model, or the numerous flow routing algorithms for interrogating overland flow paths. This is likely one of the reasons that open-source GIS packages have become so widely used in recent years.

So the benefits of an engaged user community with the ability to inspect software source code are numerous and profound, but aren’t these benefits realized by all FOSS GIS? As with anything worth looking into deeply, the answer is probably more complex than it initially appears. It’s true that all FOSS allow users the opportunity to download source code and to inspect the internal workings. This is in contrast to proprietary software for which the user can only gain insight into the workings of a tool from the provided help documentation. But the traditional method of developing FOSS doesn’t lend itself to end-user code inspection.

The concept of open-access software is based on the idea that software should be designed in a way that reduces the barriers that often discourage or disallow end-users from examining the algorithm design and implementation associated with the source code of specific software artefacts, e.g. geospatial tools in a GIS. That is, open-access software encourages the educational opportunities gained by direct inspection of code. It is important to understand that I am referring to barriers encountered by the typical end-user that may be interested in more completely understanding the details of how a specific tool works; I’m not considering the barriers encountered by the developers of the software…that’s a different topic that I’ll leave for another day. Think about how often you’ve used a GIS and wondered how some tool or operation functions after you pressed the OK button on the tool’s dialog. Even if it was an open-source GIS, you probably didn’t go any further than reading the help documentation to satisfy your curiosity. Why is that? It likely reflects a set of barriers that discourages user engagement and is inherent in the typical implementation of the open-source software model (and certainly the proprietary model too). An open-access software model, however, states that the reduction of these barriers should be a primary design goal that is taken into account at the inception of the project.

The main barriers that restrict the typical user of an open-source GIS from engaging with the code include the following:

  1. The need to download source code from a project repository that is separate from the main software artefact (i.e. the executable file). Often the project source-code files are larger than the actual program, and downloading such large files can be challenging for people with limited access to the internet.
  2. The need to download and install a specialized software program, called an integrated development environment (IDE), often required to open and view a project’s source code files. A typical GIS end-user who may find themselves interested in how a particular tool works is less likely to install this additional software, presenting yet another barrier between the user and the source code. 
  3. The familiarity with the software project’s organizational structure needed to navigate the project files and locate the code related to a specific tool. Most desktop GIS projects consist of hundreds of thousands of lines of computer code contained within many hundreds of files, and large projects possess complex organizational structures that are familiar only to the core group of developers. The level of familiarity needed to navigate to the code associated with a particular feature presents a significant barrier to the casual end-user who may be interested in gaining a more in-depth understanding of how that feature operates.
  4. The required ability to read code written in a specific programming language.  

Each of the barriers described above imposes significant obstacles for users of open-source GIS that discourage deeper probing into how operations function. There may be other barriers, but those listed above are certainly among the most significant. Whitebox GAT attempts to address some of these issues by allowing users to view the computer code associated with each plug-in directly from the tool’s dialog. Thus, just as a detailed description of a tool’s workings is provided in the help documentation, which appears within the tool’s dialog, so too can the user view the actual algorithm implementation simply by selecting the View Code button on the dialog.

The Clump tool dialog. Notice the View Code button common to all tool dialogs.

This removes the need to download separate, and often large, project source-code files, and it eliminates the requisite familiarity with the project needed to identify the lines of code related to the operation of the tool of interest. Furthermore, the tool’s code appears within an embedded window that provides syntax highlighting to enhance the viewer’s ability to interpret the information. The View Code button is so much more than a quirk of Whitebox GAT; it’s the embodiment of a design philosophy that empowers the software’s user community. This model has the potential to encourage further community involvement and feedback. Among the group of users who are comfortable with GIS programming and development, the ability to readily view the code associated with a tool allows rapid transfer of knowledge and best practices for enhancing performance. This model also encourages more rapid development, because new functionality can be added simply by modifying existing code. The 1.0 series of Whitebox, developed using the .NET framework, even had the ability to automatically translate code written in one programming language into several other languages, thereby increasing the potential for knowledge transfer and lessening Barrier 4 above. Unfortunately, this feature could not be replicated when the project migrated to the Java platform, although there are ongoing efforts to implement a similar feature.

So, that’s what I mean by open-access GIS. I think that it is a novel concept with the potential to significantly enhance the area of open-source software development in a way that will benefit the whole user community. So when people ask me why I bothered to write my own GIS when I could simply have contributed to one of the many successful and interesting open-source GIS projects that are out there, my reply is usually centred around the need for an open-access GIS. Some would say that I am an idealist, but oddly, I tend to think of myself as a pragmatist. In any case, the world could benefit from more idealists, don’t you think? If you have comments, suggestions, or feedback please leave them in the comments section below. And, as always, best wishes and happy geoprocessing.

Note: this blog is based on sections of a presentation that I gave at GISRUK 2014 and a manuscript that I am preparing for publication on the topic.