So, what exactly is ‘open-access’ software?

There’s no doubt that Whitebox Geospatial Analysis Tools is open-source software. It is developed under a free and open-source software (FOSS) licence called the GNU General Public Licence, and its source code is publicly available and modifiable. But I often say that Whitebox GAT is an example of open-access software. So what exactly do I mean by the term open-access software? That’s a good question since, as far as I know, I made the term up. Actually, open access is a fairly common idea these days and has largely developed out of a perceived need for greater public access to the outputs of academic publishing. The concept of open access is defined in the statement of the Budapest Open Access Initiative (Chan et al., 2002) as the publication of scholarly literature in a format that removes financial, legal, and technical access barriers to knowledge transfer. Although this definition of open access focuses solely on the publication of research literature, I would argue that the stated goal of reducing barriers to knowledge transfer can be equally applied to software. In fact, many of the goals of open access stated in the definition above are realized by open-source software. Therefore, open-access software can be viewed as a complementary extension to the traditional open-source model of software development.

The idea of open-access software is that the user community as a whole benefits from the ability of individual users to examine the internal workings of the software. In the case of geospatial software, e.g. a GIS or a remote sensing package, this is likely to relate to specific algorithms associated with various analysis tools. Câmara and Fonseca (2007) also recognized that adoption of open-source software is not only a choice of software, but also a means of acquiring knowledge. Direct insight into the workings of algorithm design and implementation allows for educational opportunities and a deeper level of knowledge transfer, as well as the potential for rapid innovation, software improvements, and community-directed development. I would argue that this is particularly important in the GIS field because many geospatial algorithms are highly complex and are affected by implementation details. There are often multiple competing algorithms for accomplishing the same task, and the choice of one method over another can greatly affect the outcome of a spatial analysis operation. For example, consider how many different methods there are to measure slope gradient from a digital elevation model, or the numerous flow routing algorithms for modelling overland flow paths. This is likely one of the reasons that open-source GIS packages have become so widely used in recent years.
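To make the slope example concrete, here is a minimal sketch in Python that applies two widely used slope estimators to the same 3 × 3 elevation window: Horn's (1981) third-order finite difference and the simpler four-neighbour method of Zevenbergen and Thorne (1987). The function names, the sample elevations, and the cell size are all made up for illustration, and this is not necessarily how Whitebox GAT implements its own slope tool.

```python
import math


def slope_horn(window, cell_size):
    """Slope in degrees using Horn's (1981) third-order finite difference.

    All eight neighbours contribute, with the four edge neighbours
    weighted twice as heavily as the four corners.
    """
    (z1, z2, z3), (z4, _, z6), (z7, z8, z9) = window
    dz_dx = ((z3 + 2 * z6 + z9) - (z1 + 2 * z4 + z7)) / (8 * cell_size)
    dz_dy = ((z7 + 2 * z8 + z9) - (z1 + 2 * z2 + z3)) / (8 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))


def slope_zevenbergen_thorne(window, cell_size):
    """Slope in degrees using the Zevenbergen and Thorne (1987) method.

    Only the four edge neighbours are used; the corners are ignored.
    """
    (_, z2, _), (z4, _, z6), (_, z8, _) = window
    dz_dx = (z6 - z4) / (2 * cell_size)
    dz_dy = (z8 - z2) / (2 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))


# Hypothetical elevations (metres) on a 10 m grid; the corner cells are
# deliberately rough so that the two estimators disagree.
rough_window = [
    [10.0, 12.0, 20.0],
    [11.0, 13.0, 14.0],
    [9.0, 12.0, 30.0],
]

# A perfectly planar surface, on which both estimators should agree.
planar_window = [
    [0.0, 2.0, 4.0],
    [0.0, 2.0, 4.0],
    [0.0, 2.0, 4.0],
]

print("Rough window, Horn:", slope_horn(rough_window, 10.0))
print("Rough window, Zevenbergen-Thorne:", slope_zevenbergen_thorne(rough_window, 10.0))
print("Planar window, Horn:", slope_horn(planar_window, 1.0))
print("Planar window, Zevenbergen-Thorne:", slope_zevenbergen_thorne(planar_window, 1.0))
```

On the planar window the two methods return the same slope, but on the rough window they differ noticeably because only Horn's method takes the corner cells into account. A user who cannot see which estimator a tool uses has no way of knowing which of these answers they have been given.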

So the benefits of an engaged user community with the ability to inspect software source code are numerous and profound, but aren’t these benefits realized by all FOSS GIS? As with anything worth looking into deeply, the answer is probably more complex than it initially appears. It’s true that all FOSS allows users the opportunity to download source code and to inspect the internal workings. This is in contrast to proprietary software, for which the user can only gain insight into the workings of a tool from the provided help documentation. But the traditional method of developing FOSS doesn’t lend itself to end-user code inspection.

The concept of open-access software is based on the idea that software should be designed in a way that reduces the barriers that often discourage or prevent end-users from examining the algorithm design and implementation associated with the source code of specific software artefacts, e.g. geospatial tools in a GIS. That is, open-access software encourages the educational opportunities gained by direct inspection of code. It is important to understand that I am referring to barriers encountered by the typical end-user who may be interested in more completely understanding the details of how a specific tool works; I’m not considering the barriers encountered by the developers of the software…that’s a different topic that I’ll leave for another day. Think about how often you’ve used a GIS and wondered how some tool or operation functions after you pressed the OK button on the tool’s dialog. Even if it was an open-source GIS, you probably didn’t go any further than reading the help documentation to satisfy your curiosity. Why is that? It likely reflects a set of barriers that discourage user engagement and are inherent in the typical implementation of the open-source software model (and certainly the proprietary model too). An open-access software model, however, states that the reduction of these barriers should be a primary design goal that is taken into account at the inception of the project.

The main barriers that restrict the typical user of an open-source GIS from engaging with the code include each of the following:

  1. The need to download source code from a project repository that is separate from the main software artefact (i.e. the executable file). Often the project source-code files are larger than the actual program and downloading such a large file can be challenging for people with limited access to the internet.
  2. The need to download and install a specialized software program, called an integrated development environment (IDE), often required to open and view a project’s source code files. A typical GIS end-user who may find themselves interested in how a particular tool works is less likely to install this additional software, presenting yet another barrier between the user and the source code. 
  3. The required familiarity with the software project’s organizational structure needed to navigate the project files to locate the code related to a specific tool. That is, an understanding of the organization of the source code is necessary to identify the code associated with a specific tool or algorithm of interest. Most desktop GIS projects consist of hundreds of thousands of lines of computer code contained within many hundreds of files. Large projects possess complex organizational structures that are familiar only to the core group of developers. The level of familiarity with a project’s organization that is needed to navigate to the code associated with a particular feature or tool presents a significant barrier to the casual end-user who may be interested in gaining a more in-depth understanding of how a specific feature operates.
  4. The required ability to read code written in a specific programming language.  

Each of the barriers described above imposes a significant obstacle for users of open-source GIS and discourages deeper probing into how operations function. There may be other barriers, but those listed above are certainly among the most significant. Whitebox GAT attempts to address some of these issues by allowing users to view the computer code associated with each plug-in directly from the tool’s dialog. Thus, just as a detailed description of a tool’s workings is provided in the help documentation, which appears within the tool’s dialog, so too can the user choose to view the actual algorithm implementation simply by selecting the View Code button on the dialog.

The Clump tool dialog. Notice the View Code button common to all tool dialogs.

This removes the need to download separate, and often large, project source code files, and it eliminates the need to be familiar with the project in order to identify the lines of code related to the operation of the tool of interest. Furthermore, the tool’s code appears within an embedded window that provides syntax highlighting to enhance the viewer’s ability to interpret the information. The View Code button is so much more than a quirk of Whitebox GAT; it’s the embodiment of a design philosophy that empowers the software’s user community. This model has the potential to encourage further community involvement and feedback. Among the group of users who are comfortable with GIS programming and development, the ability to readily view the code associated with a tool can allow rapid transfer of knowledge and best practices for enhancing performance. This model also encourages more rapid development because new functionality can be added simply by modifying existing code. The 1.0 series of Whitebox, developed using the .NET framework, even had the ability to automatically translate code written in one programming language into several other languages, thereby increasing the potential for knowledge transfer and lessening Barrier 4 above. Unfortunately, this feature could not be replicated when the project migrated to the Java platform, although there are ongoing efforts to implement a similar feature.

So, that’s what I mean by open-access GIS. I think that it is a novel concept with the potential to significantly enhance the area of open-source software development in a way that will benefit the whole user community. So when people ask me why I bothered to write my own GIS when I could simply have contributed to one of the many successful and interesting open-source GIS projects that are out there, my reply is usually centred around the need for an open-access GIS. Some would say that I am an idealist, but oddly, I tend to think of myself as a pragmatist. In any case, the world could benefit from more idealists, don’t you think? If you have comments, suggestions, or feedback please leave them in the comments section below. And, as always, best wishes and happy geoprocessing.

Note: this blog is based on sections of a presentation that I gave at GISRUK 2014 and a manuscript that I am preparing for publication on the topic.