SRTM DEM data downloader in Whitebox GAT

I just finished developing a tool for Whitebox GAT that will automatically download Shuttle Radar Topography Mission (SRTM) digital elevation models (DEMs) from the USGS SRTM FTP site. SRTM-3 data are among the best global elevation data, with a grid resolution of approximately 90 m. In many areas SRTM data provide the only topographic data set available. Within the United States, the SRTM-1 dataset provides an improved 30 m resolution. Not only does this Whitebox tool retrieve the SRTM tiles contained within the bounding box of a specified area of interest, but it will also import the tiles to Whitebox GAT, fill missing data holes (which are common with SRTM data in rugged terrain) and mosaic the tiles.
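The first step, working out which tiles cover an area of interest, can be sketched in a few lines of Python. This is only an illustration and not the code used by the Whitebox tool: the function name is hypothetical, and it assumes the usual SRTM convention in which each 1-degree tile is named after its south-west corner (e.g. N51W001) with an `.hgt` extension.

```python
import math

def srtm_tile_names(min_lat, max_lat, min_lon, max_lon):
    """List the 1-degree SRTM tiles that intersect a bounding box.

    Hypothetical helper: assumes tiles are named after their
    south-west corner, e.g. N51W001.hgt.
    """
    lat_start = math.floor(min_lat)
    lon_start = math.floor(min_lon)
    # ceil() stops a box edge lying exactly on a tile boundary from
    # pulling in the neighbouring tile; max() guarantees at least one tile.
    lat_stop = max(math.ceil(max_lat), lat_start + 1)
    lon_stop = max(math.ceil(max_lon), lon_start + 1)

    names = []
    for lat in range(lat_start, lat_stop):
        for lon in range(lon_start, lon_stop):
            ns = "N" if lat >= 0 else "S"
            ew = "E" if lon >= 0 else "W"
            names.append(f"{ns}{abs(lat):02d}{ew}{abs(lon):03d}.hgt")
    return names

# A box over part of southern England needs six tiles.
print(srtm_tile_names(50.2, 51.7, -2.5, -0.5))
```

Each name in the returned list would then be downloaded, imported, hole-filled and mosaicked.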

Whitebox’s new Retrieve SRTM Data tool

There have been many times in the past when I have needed to download numerous SRTM files, import the files, fill the missing data holes, and finally mosaic the individual tiles. It can be a laborious workflow indeed. This tool will save me a great deal of time, and so I’m rather excited about it. It’s as though the data magically appear in Whitebox!

SRTM data in Whitebox GAT

Of course, with Whitebox GAT’s extensive Terrain Analysis and Hydrological Analysis toolboxes, there are plenty of interesting things that you can do once those data do magically appear.

90 m DEM of the entire British Isles. (Note that the image is of a coarser resolution than the actual DEM.)

As an example, I created the SRTM-3 90 m DEM of the entire British Isles shown above in 4 minutes, 56.27 seconds, including the time to download 91 individual SRTM tiles, fill their missing data gaps, and mosaic the tiles. My son was even watching Netflix during my downloading, so I can only imagine how much that slowed things down! I only wish that other data providers would follow a data-sharing model similar to that of the USGS and use an anonymous FTP server to distribute their data. If that were the case, we could have many other data sets automatically ingested directly into Whitebox. I’ll release this new SRTM retrieval tool in the next public release of Whitebox GAT (v 3.2.1), which will likely be sometime later this summer. If you are as keen to try it out as I am, email me for a preview copy. If you have any other comments or feedback, please leave them in the comments section below. As always, best wishes and happy geoprocessing.

John Lindsay

*****EDIT*****

Version 3.2.1 of Whitebox has now been released, with the SRTM downloader tool embedded in the Data Layers menu. Please let me know if you have any issues in using the tool.

****EDIT****

I’ve included an image of the DEM of Ireland for John below:

SRTM DEM of Ireland


So, what exactly is ‘open-access’ software?

There’s no doubt that Whitebox Geospatial Analysis Tools is open-source software. It is developed under a free and open-source software (FOSS) licence called the GNU General Public Licence, and its source code is publicly available and modifiable. But I often say that Whitebox GAT is an example of open-access software. So what exactly do I mean by the term open-access software? That’s a good question since, as far as I know, I made the term up. Actually, open-access is a fairly common idea these days and has largely developed out of a perceived need for greater public access to the outputs of academic publishing. The concept of open access is defined in the statement of the Budapest Open Access Initiative (Chan et al., 2002) as the publication of scholarly literature in a format that removes financial, legal, and technical access barriers to knowledge transfer. Although this definition of open access focuses solely on the publication of research literature, I would argue that the stated goals of reducing barriers associated with knowledge transfer can be equally applied to software. In fact, many of the goals of open-access stated in the definition above are realized by open-source software. Therefore, open-access software can be viewed as a complementary extension to the traditional open-source model of software development.

The idea of open-access software is that the user community as a whole benefits from the ability of individual users to examine the internal workings of the software. In the case of geospatial software, e.g. a GIS or remote sensing software, this is likely to relate to specific algorithms associated with various analysis tools. Câmara and Fonseca (2007) also recognized that adoption of open-source software is not only a choice of software, but also a means of acquiring knowledge. Direct insight into the workings of algorithm design and implementation allows for educational opportunities and a deeper level of knowledge transfer, as well as the potential for rapid innovation, software improvements, and community-directed development. I would argue that this is particularly important in the GIS field because many geospatial algorithms are highly complex and are impacted by implementation details. There are often multiple competing algorithms for accomplishing the same task and the choice of one method over another can greatly impact the outcome of a spatial analysis operation. For example, consider how many different methods there are to measure the pattern of slope gradient from a digital elevation model or the numerous flow routing algorithms for interrogating overland flow paths. This is likely one of the reasons that open-source GIS packages have become so widely used in recent years.

So the benefits of an engaged user community with the ability to inspect software source code are numerous and profound, but aren’t these benefits realized by all FOSS GIS? As with anything worth looking into deeply, the answer is probably more complex than it initially appears. It’s true that all FOSS allow users the opportunity to download source code and to inspect the internal workings. This is in contrast to proprietary software for which the user can only gain insight into the workings of a tool from the provided help documentation. But the traditional method of developing FOSS doesn’t lend itself to end-user code inspection.

The concept of open-access software is based on the idea that software should be designed in a way that reduces the barriers that often discourage or disallow end-users from examining the algorithm design and implementation associated with the source code of specific software artefacts, e.g. geospatial tools in a GIS. That is, open-access software encourages the educational opportunities gained by direct inspection of code. It is important to understand that I am referring to barriers encountered by the typical end-user that may be interested in more completely understanding the details of how a specific tool works; I’m not considering the barriers encountered by the developers of the software…that’s a different topic that I’ll leave for another day. Think about how often you’ve used a GIS and wondered how some tool or operation functions after you pressed the OK button on the tool’s dialog. Even if it was an open-source GIS, you probably didn’t go any further than reading the help documentation to satisfy your curiosity. Why is that? It likely reflects a set of barriers that discourages user engagement and is inherent in the typical implementation of the open-source software model (and certainly the proprietary model too). An open-access software model, however, states that the reduction of these barriers should be a primary design goal that is taken into account at the inception of the project.

The main barriers that restrict the typical user of an open-source GIS from engaging with the code include each of the following:

  1. The need to download source code from a project repository that is separate from the main software artefact (i.e. the executable file). Often the project source-code files are larger than the actual program and downloading such a large file can be challenging for people with limited access to the internet.
  2. The need to download and install a specialized software program, called an integrated development environment (IDE), often required to open and view a project’s source code files. A typical GIS end-user who may find themselves interested in how a particular tool works is less likely to install this additional software, presenting yet another barrier between the user and the source code. 
  3. The required familiarity with the software project’s organizational structure needed to navigate the project files to locate the code related to a specific tool. That is, an understanding of the organization of the source code is necessary to identify the code associated with a specific tool or algorithm of interest. Most desktop GIS projects consist of hundreds of thousands of lines of computer code that are contained within many hundreds of files. Large projects possess complex organizational structures that are only familiar to the core group of developers. The level of familiarity with a project’s organization that is needed to navigate to the code associated with a particular feature or tool presents a significant barrier to the casual end-user who may be interested in gaining a more in-depth understanding of how a specific feature operates.
  4. The required ability to read code written in a specific programming language.  

Each of the barriers described above imposes a significant obstacle for users of open-source GIS and discourages deeper probing into how operations function. There may be other barriers, but those listed above are certainly among the most significant. Whitebox GAT attempts to address some of these issues by allowing users to view the computer code associated with each plug-in directly from the tool’s dialog. Thus, just as a detailed description of a tool’s working is provided in the help documentation, which appears within the tool’s dialog, so too can the user choose to view the actual algorithm implementation simply by selecting the View Code button on the dialog.

The Clump tool dialog. Notice the View Code button common to all tool dialogs.

This removes the need to download separate, and often large, project source code files and it eliminates the requisite familiarity with the project to identify the lines of code related to the operation of the tool of interest. Furthermore the tool’s code will appear within an embedded window that provides syntax highlighting to enhance the viewer’s ability to interpret the information. The View Code button is so much more than a quirk of Whitebox GAT; it’s the embodiment of a design philosophy that empowers the software’s user community. This model has the potential to encourage further community involvement and feedback. Among the group of users that are comfortable with GIS programming and development, the ability to readily view the code associated with a tool can allow rapid transfer of knowledge and best-practices for enhancing performance. This model also encourages more rapid development because new functionality can be added simply by modifying existing code. The 1.0 series of Whitebox, developed using the .NET framework, even had the ability to automatically translate code written in one programming language into several other languages, thereby increasing the potential for knowledge transfer and lessening Barrier 4 above. Unfortunately this feature could not be replicated when the project migrated to the Java platform although there are on-going efforts to implement a similar feature.

So, that’s what I mean by open-access GIS. I think that it is a novel concept with the potential to significantly enhance the area of open-source software development in a way that will benefit the whole user community. So when people ask me why I bothered to write my own GIS when I could simply have contributed to one of the many successful and interesting open-source GIS projects that are out there, my reply is usually centred around the need for an open-access GIS. Some would say that I am an idealist, but oddly, I tend to think of myself as a pragmatist. In any case, the world could benefit from more idealists, don’t you think? If you have comments, suggestions, or feedback please leave them in the comments section below. And, as always, best wishes and happy geoprocessing.

Note: this blog is based on sections of a presentation that I gave at GISRUK 2014 and a manuscript that I am preparing for publication on the topic.

Mapping Watersheds in Whitebox GAT

Two of the most common activities involving digital elevation models (DEMs) include extracting stream networks and mapping watersheds. A watershed, often referred to as a drainage basin, is the land area that drains surface waters to a particular location, or outlet, in the landscape. I frequently get asked how to map watersheds using Whitebox GAT and so I thought that I would cover that topic in a blog. The Watershed tool, located in the Hydrological Tools toolbox, is used to map the areas draining to one or more outlets.

Whitebox's Watershed Tool

The Watershed tool requires two input files. The first input is a flow pointer raster derived using the D8 flow algorithm. The D8 flow pointer raster is a remarkably useful grid and, although it isn’t much to look at, it is used as a main input for dozens of tools in Whitebox GAT. That’s because it can be used to determine the local drainage direction network, that is, the tree graph that connects every grid cell in a DEM to the upslope cells that drain to each cell and the downslope flow path that each cell drains to. This grid must be created using the D8, or steepest descent, flow algorithm and it must be created from a DEM that has been pre-processed to remove all artefact topographic depressions, i.e. areas of internal drainage. This is usually achieved by either filling or breaching (channelling) the depressions in the DEM. The D8 flow pointer raster is actually quite fast to create from a hydrologically corrected DEM, so you might wonder why Whitebox’s Watershed tool doesn’t just have you input the DEM and then it calculates the D8 pointer internally, saving you the hassle. After all, you’re unlikely to ever want to display the pointer raster. The answer is that a common terrain analysis workflow may involve numerous analyses (stream network extraction, stream network analysis, watershed extraction, etc.) that each require this pointer grid. They could all start off by calculating the pointer grid from your DEM but that would be terribly redundant. And since each workflow is unique this approach is the most flexible.
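To make the idea of a D8 pointer concrete, here is a minimal Python sketch of the steepest-descent rule on a tiny DEM stored as a list of lists. It is an illustration only, not Whitebox's implementation: the function name is hypothetical, and each cell's pointer is stored as a (row, column) offset to its downslope neighbour rather than in any particular numeric encoding. It assumes the DEM has already been hydrologically corrected.

```python
import math

# Offsets to the eight neighbours as (row step, column step).
NEIGHBOURS = [(-1, 0), (-1, 1), (0, 1), (1, 1),
              (1, 0), (1, -1), (0, -1), (-1, -1)]

def d8_pointer(dem, cell_size=1.0):
    """Return, for every cell, the offset to its steepest downslope
    neighbour, or None where no neighbour is lower."""
    rows, cols = len(dem), len(dem[0])
    pointer = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best_slope = 0.0
            for dr, dc in NEIGHBOURS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    # Diagonal neighbours are sqrt(2) cell widths away.
                    distance = cell_size * math.hypot(dr, dc)
                    slope = (dem[r][c] - dem[nr][nc]) / distance
                    if slope > best_slope:
                        best_slope = slope
                        pointer[r][c] = (dr, dc)
    return pointer

dem = [[9, 8, 7],
       [8, 6, 4],
       [7, 4, 1]]
pointer = d8_pointer(dem)
print(pointer[0][0])  # (1, 1): the top-left cell drains diagonally
print(pointer[2][2])  # None: the lowest cell has nowhere to drain
```

Following these offsets from cell to cell traces the downslope flow path, which is why so many tools can share the one grid.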

Now then, the second input to the Watershed tool is a pour point, or outlet, file. This can be either a raster grid or a vector points file (shapefile). If the D8 flow pointer is used to figure out how each grid cell in a raster DEM is connected to the network of overland flow paths, then the pour point file tells the tool the points in the network for which to extract the drainage basin, i.e. all of the upslope grid cells that drain to a particular pour point. If the pour points file is a shapefile containing multiple points, a watershed will be extracted for each of these points, with a unique identifier (usually the FID) assigned to each watershed. If the pour points file is a raster, watersheds will be extracted for all non-zero positive valued grid cells and the watershed identifier will be the same as the pour point values.
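How the two inputs combine can be sketched in Python as follows. Again, this is a hedged illustration and not the Watershed tool's code: the function name is hypothetical, the pointer grid holds (row, column) offsets as an assumed encoding, pour points are given as a dictionary of identifier to cell, and 0 is used to mean "drains to no pour point".

```python
def map_watersheds(pointer, pour_points):
    """Label each cell with the id of the first pour point reached by
    following the flow path downstream (0 if none is reached)."""
    rows, cols = len(pointer), len(pointer[0])
    outlet_ids = {cell: wid for wid, cell in pour_points.items()}
    labels = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            cell = (r, c)
            seen = set()  # guards against loops in a malformed pointer grid
            while cell is not None and cell not in seen:
                if cell in outlet_ids:
                    labels[r][c] = outlet_ids[cell]
                    break
                seen.add(cell)
                step = pointer[cell[0]][cell[1]]
                if step is None:
                    break  # flow path ends without meeting a pour point
                nxt = (cell[0] + step[0], cell[1] + step[1])
                inside = 0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                cell = nxt if inside else None
    return labels

E, S, SE = (0, 1), (1, 0), (1, 1)
pointer = [[SE, S,  S],
           [E,  SE, S],
           [E,  E,  None]]
# Pour point 1 is the grid outlet; pour point 2 sits upstream of it.
for row in map_watersheds(pointer, {1: (2, 2), 2: (1, 1)}):
    print(row)
```

Cells upstream of the nested pour point take its identifier, and the remaining cells are assigned to the outlet further downstream.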

So, where are you going to get these pour points? Chances are that if you’re extracting watersheds then you likely have some points of interest in mind for which you’d like to map the drainage basin. These are usually points along the stream network. Sometimes they’re locations where you have data collected, for example a hydrometric station where you have information about stream discharge and water quality. You may even have the GPS coordinates for the points that you’ve imported into Whitebox GAT (see the Import CSV and Import XYZ tools). If this isn’t the case, then you’re probably going to want to digitize pour points using Whitebox’s on-screen digitizing capabilities. Take a look at the tutorial in the Whitebox help documentation called ‘How to digitize new vectors’ for detailed instructions on how to do this. To summarize briefly, the steps are as follows. First, create a flow accumulation grid using the D8 flow accumulation tool and display the raster (a vector streams layer can also be useful for this). Larger values in the flow accumulation grid will coincide with valley bottoms and stream channels. Second, create a new vector points file (Create New Shapefile tool). Third, zooming into the areas of interest, use the on-screen digitizing tools to digitize individual pour points.
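To see why large flow accumulation values pick out the channels, here is a small Python sketch that accumulates cell counts down a D8 pointer grid. It is an illustration under stated assumptions, not the Whitebox tool: the function name is hypothetical, pointers are (row, column) offsets, and each cell is assumed to count itself, so the minimum value is 1.

```python
from collections import deque

def d8_flow_accumulation(pointer):
    """Count, for each cell, itself plus every cell upstream of it."""
    rows, cols = len(pointer), len(pointer[0])

    def downstream(r, c):
        step = pointer[r][c]
        if step is None:
            return None
        nr, nc = r + step[0], c + step[1]
        return (nr, nc) if 0 <= nr < rows and 0 <= nc < cols else None

    # Number of neighbours draining directly into each cell.
    inflow = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            target = downstream(r, c)
            if target is not None:
                inflow[target[0]][target[1]] += 1

    acc = [[1] * cols for _ in range(rows)]
    # Start at cells with no inflow and work downstream.
    queue = deque((r, c) for r in range(rows) for c in range(cols)
                  if inflow[r][c] == 0)
    while queue:
        r, c = queue.popleft()
        target = downstream(r, c)
        if target is None:
            continue
        tr, tc = target
        acc[tr][tc] += acc[r][c]
        inflow[tr][tc] -= 1
        if inflow[tr][tc] == 0:
            queue.append(target)
    return acc

E, S, SE = (0, 1), (1, 0), (1, 1)
pointer = [[SE, S,  S],
           [E,  SE, S],
           [E,  E,  None]]
for row in d8_flow_accumulation(pointer):
    print(row)
```

The outlet cell collects all nine cells of this toy grid, and the cells along the main flow path stand out from the hillslope cells around them.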

On-screen digitizing

Okay, here’s an important point: whether you’ve imported your pour points from GPS coordinates gathered in the field or digitized them on-screen, it is quite unlikely that your pour points coincide with the digital stream. Even if you used a surveyor-grade, highly precise GPS to get coordinates down to the nearest millimetre and you know for certain that the points are on the actual stream, it really only matters if the points lie on the path of the digital stream, i.e. the stream within the overland flow network in the DEM. Otherwise, if one or more of the pour points fall off of the digital stream even by one grid cell, you’ll extract a ‘stub watershed’ (I think I just made this term up, but it’s a good one!). The image below is an example of this. The pour point was clearly intended to be located at the outlet to the smaller basin just above the confluence (where stream tributaries join). Whether or not it is on the actual stream, or even the mapped stream, if it falls off of the digital flow path, the mapped watershed will not be the intended watershed.

A stub watershed outlined in pale green for the red pour point.

To fix this particular problem, you need to relocate your pour points so that they are coincident with the digital stream, denoted either by the DEM-extracted stream network or the flow accumulation raster from which the extracted streams are derived. In ArcGIS terminology this task is known as Snap Pour Points, named after the tool used to perform the operation, which is supposed to snap the pour point onto the digital stream. Whitebox GAT also has a Snap Pour Points tool, which works the same way. You give the tool a file containing one or more pour points and a flow accumulation raster, and the tool will reposition each outlet to coincide with the grid cell with the highest flow accumulation value within a circular neighbourhood of a specified radius. Here’s the thing…don’t use this tool! Its effect can be profoundly awful, particularly when you have pour points located at the outlets of sub-basins in a larger network, which is often the case. Take the example in the ‘stub watershed’ figure above. The pour point is intended to be located above the confluence at the outlet of the smaller tributary. Snap Pour Points will relocate it below the confluence (the highest flow accumulation value) and the extracted watershed will include not only the smaller sub-basin but also the main stream trunk…it will be many times larger than intended. Instead, I recommend using the Jenson Snap Pour Points tool, which will reposition each pour point to the nearest stream cell, which is much more likely to coincide with the position that you intended and less likely to result in over-sized watersheds. I wrote a paper on this topic; if you’re interested, contact me and I’ll forward it to you.
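The difference between the two snapping strategies can be sketched in Python on a toy flow accumulation grid. This is a rough illustration rather than the code behind either Whitebox tool: the function names are hypothetical, the accumulation values are made up, and the stream cells are assumed to be those with an accumulation of 5 or more.

```python
import math

def cells_within(point, radius, rows, cols):
    """Yield grid cells inside a circular neighbourhood of point."""
    r0, c0 = point
    reach = int(math.floor(radius))
    for r in range(max(0, r0 - reach), min(rows, r0 + reach + 1)):
        for c in range(max(0, c0 - reach), min(cols, c0 + reach + 1)):
            if math.hypot(r - r0, c - c0) <= radius:
                yield (r, c)

def snap_to_max_accumulation(point, acc, radius):
    """Move the point to the highest flow accumulation cell in range."""
    cells = cells_within(point, radius, len(acc), len(acc[0]))
    return max(cells, key=lambda rc: acc[rc[0]][rc[1]])

def snap_to_nearest_stream(point, streams, radius):
    """Move the point to the closest stream cell in range, if any."""
    cells = cells_within(point, radius, len(streams), len(streams[0]))
    candidates = [rc for rc in cells if streams[rc[0]][rc[1]]]
    if not candidates:
        return point
    return min(candidates,
               key=lambda rc: math.hypot(rc[0] - point[0], rc[1] - point[1]))

# Illustrative values: a tributary runs east along row 2 and joins a
# main stem that runs south down column 4.
acc = [[1, 1, 1, 1, 50],
       [1, 1, 1, 1, 55],
       [2, 5, 9, 12, 70],
       [1, 1, 1, 1, 75],
       [1, 1, 1, 1, 80]]
streams = [[value >= 5 for value in row] for row in acc]

pour_point = (1, 2)  # one cell off the tributary
print(snap_to_max_accumulation(pour_point, acc, 2.5))  # (2, 4)
print(snap_to_nearest_stream(pour_point, streams, 2.5))  # (2, 2)
```

The maximum-accumulation rule jumps to the confluence cell on the main stem, so the resulting watershed would take in the main trunk as well, whereas the nearest-stream rule keeps the point on the tributary it was meant for.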

So that’s it, after you have your D8 flow pointer, extracted from your hydrologically corrected DEM, and a file containing one or more properly positioned pour points, you can run the Watershed tool and you’ll get an output raster containing the drainage basins which direct surface waters to each of your specified outlets. You may want to use the Raster To Vector Polygons tool to turn the watersheds into a vector layer for map display as a last step. Oh, and by the way, if you have an extracted stream network you can extract all sorts of interesting watershed related layers like sub-basins, hillslopes, and Strahler order basins. Watershed mapping can reveal important information about drainage within a landscape and is a very powerful form of spatial analysis. If you have any questions, post them to the comments section below. And, as always, best wishes and happy geoprocessing.

John Lindsay

Watersheds

Whitebox GAT v. 3.2.0

I am very pleased to announce the release of the open-source GIS Whitebox Geospatial Analysis Tools 3.2.0 (http://www.uoguelph.ca/~hydrogeo/Whitebox/index.html). Whitebox GAT now contains over 370 tools for performing advanced geospatial analysis.  While this version includes several new tools and minor bug fixes (see below), the main change has been upgrading to the latest major release of Java, version 8. This upgrade offers several behind-the-scenes advantages in terms of programming. However, the main advantage is that the migration to Java 8 allows for the eventual update of the Whitebox user interface to the more modern JavaFX library instead of the more dated Swing user interface library that is currently being used. This migration to JavaFX will take some time, but this version release of Whitebox GAT paves the way forward. Importantly, this version targets Java 8 and therefore will require users to update their Java Runtime Environment. This release is unlikely to work correctly if you are still running Java 7. Other notable changes in this new release include:

  • Added a Cluster Attributes tool that performs k-means clustering on a selected group of attributes associated with a vector file.
  • Added zoom to selection for vector layers.
  • Added a Clip tool for vector clipping operations. I know, why has it taken this long? The Clip tool runs concurrently, taking advantage of all those extra cores on your processor, so it is quite fast even with large datasets.
  • Added floating-point line thicknesses and changed the default line thickness to 0.75. This has greatly improved the cartographic output of Whitebox GAT in my opinion.
  • Added Export Table to CSV tool.
  • Added Hack Stream Ordering and Topological Stream Ordering tools.
  • Added Total Length of Upstream Channels tool.
  • Added Furtherest Upstream Channel Head Distance tool.
  • Added the Attribute Histogram tool for creating histograms based on numeric data contained in a shapefile’s attribute table, as well as the Attribute Scattergram tool.
  • Vectors can now be displayed with a palette rendered based on a boolean attribute.
  • Added palette nonlinearity for vectors with a scaled palette.
  • Fixed the raster-to-vector polygon conversion to include polygon holes.
  • Modified the minimum bounding box tool, and all related tools (e.g. elongation ratio, long axis, short axis) to use an analytical solution for finding the MBB.
  • Added a link to the Whitebox blog (https://whiteboxgeospatial.wordpress.com) in the help menu.
  • Added the Vector Attribute Gridding tool, which can be used to interpolate the spatial pattern of average values of an attribute of vector features onto a raster grid. It essentially can be used to answer the question, of the vector features (points, lines, or polygons) within a local neighbourhood, what is the average value of some attribute? This can be quite handy for visualizing patterns.
  • Added the Vector Feature Density tool, which is similar to the Vector Attribute Gridding tool but works to map the spatial pattern of feature density (how many vector features are within a local neighbourhood?).
  • Added the ability to rotate map titles. I know this isn’t a proper labelling system yet, but I’m working on it. Hopefully map labels will be added sometime this summer.
  • Fixed yet another bug with the GeoTIFF import tool, this time related to the NoData value. As several of you know, the GeoTIFF format is one of my least favourite data formats, despite being so common, because it is open-ended and there are a great many variants. This makes programming data readers/writers for the format very difficult. If you have troubles with importing a particular GeoTIFF file please email me directly and I’ll see what I can do.
  • I’ve removed the split panel text area at the bottom of the Whitebox user interface and created a stand-alone dialog to handle text output. There was a problem with the split panel on Windows that would not recognize the default size preference, creating a bit of a visual nuisance that I could not fix. The dialog will be automatically displayed when text data is output from a tool and can also be opened from the View menu.

Please report any issues that you encounter as a result of this upgrade using the Whitebox GAT menu, Help -> Report An Error. As always, I hope you enjoy this new release and happy geoprocessing!

John Lindsay