The case for open-source GIS

Jan 23, 2018



I have nothing against ESRI. They have been innovators in the geospatial software world from the beginning. I got into GIS from a natural resources background and I know that they have supported the conservation community for decades through their conservation grants program and in many other ways. Jack Dangermond’s recent donation of $165 million to the Nature Conservancy to purchase one of the last large undeveloped parcels of southern California coastline stirred my heart and made me well up with tears of pride in the GIS community. I am not opposed to companies selling GIS software for profit. I believe in capitalism. I believe that entrepreneurs should be rewarded financially for producing high-quality products at a fair price.

But I also believe that any industry requires competition in order to reach its full potential. That is part of capitalism too. I wonder where the computing industry would be if Apple and Linux had not provided viable, in many ways superior, alternatives to MS-DOS. Competition from both smaller commercial firms and the open-source community did not drive Microsoft out of business, but it sure made Bill Gates and Co. step up their game to stay on top, and the entire world has benefited. The computing industry has benefited from improved software, but the rest of the world has benefited as well. The Bill and Melinda Gates Foundation has donated billions of dollars to philanthropic causes to improve healthcare and reduce poverty in the past 18 years.

So kudos to the Dangermond and Gates families for using their wealth to make the world a better place. That is the promise of capitalism that is often overlooked, and many times not realized because of individuals’ personal greed. But here’s the thing. You don’t get to stay on top just because you were the first to get there. You have to continue to be better than the rest. There are no guarantees in capitalism. If you sit on your haunches or make too many missteps, some young, hungry entrepreneur will come along and see an opportunity to knock you off your perch. It’s the way the world works.

The open-source world is booming right now. When I was in grad school in the late 90’s and early 00’s, all of my statistics classes were taught using SAS software. That was very frustrating to me because it was very expensive to get access to SAS outside of the university environment. Towards the end of my graduate program many statistics classes had begun moving to R, an open-source and modern approach to data analysis that has been almost universally adopted by the data science community. When I first began using the internet, I paid money every year for an Encyclopedia Britannica subscription. By the mid-00’s I found myself going straight to the open-source Wikipedia more often, as there was simply more information available. Programming languages used to be something you would buy. Over the past 3 decades I paid for Turbo Pascal, Turbo C++, Delphi, and several varieties of Microsoft’s Visual Studio. Today almost all of my programming is done with open-source languages such as JavaScript, PHP, and Python.

Open-source software drives the web. The two most common web servers, Apache HTTP Server and Nginx, handle over 70% of web traffic, and Microsoft’s IIS, the most common commercial web server, is a distant third at only 10%. PHP handles the server-side scripting for over 80% of web sites with a server-side component. Open-source databases like MySQL, PostgreSQL, and MongoDB handle massive amounts of dynamic data. On the client side, the story is similar. Open-source browsers, or at least browsers with a large open-source component, handle an increasing majority of web page views.

Clearly there is nothing problematic about the concept of open-source software in general. We all use it almost every day for many things. Open-source software is not a toy and it can be very, very good. That raises the question: if nobody is making a profit off it, how is open-source software able to compete successfully against large corporations with billions of dollars in annual revenue? What incentivizes users to invest their time working on software that they aren’t going to profit from?

Why is open-source software so popular?

The cheap and easy answer would be that open-source software is popular because it is free. And that certainly is highly motivating for users of the software. Why should they pay large amounts of money when they can get a perfectly viable alternative for absolutely free? But that answer is putting the cart before the horse in a way. As soon as you postulate that people use open-source software because it’s free, it simply raises more questions. Why is there very good free software available in the first place? Who would spend their time creating software and then not charge money for it? How can free software be as good as or better than commercial software?

In Spanish there are two words that translate to “free” and this illustrates the two ways the word free can be understood. “Gratis” means that something doesn’t cost any money, and that is the way most people think about free software. But there is another word, “libre”, that also means free, in the sense of no restrictions, and that is probably a more accurate way of understanding open-source software. Open-source software allows users to view and modify the source code to their heart’s content. This means that they can fix bugs or add features if they want.

Now, the vast majority of users of open-source software probably don’t have the technical knowledge to go into the software’s source code and modify it. But some do, and some will, and this ability leads to the formation of user communities focused around popular open-source projects. These user communities drive the development of the software by reporting bugs, suggesting improvements and in some cases fixing bugs and adding new features.

This bottom-up approach to development, based on the needs and desires of users, is vastly different from the top-down approach undertaken by most corporate structures, which is driven by the need to boost sales and generate profit to satisfy shareholders. The corporate approach often leads to more emphasis being given to new features that look good in advertising copy, rather than fixing bugs, streamlining processes, and optimizing the user experience. There is nothing wrong with new features, but in a large, complex software package, often only a small percentage of users actually need or want the new feature. What most users need and want is for the basic functions of the software to perform well. They want ease of use, robustness, accuracy, stability, and speed. They want software that simply works.

Bottom-up approaches to problem solving

The reason, I believe, that open-source software is becoming so popular is that in many cases it simply is pleasing to use, because its development is driven by end users rather than marketing departments. This user-based, bottom-up approach to solving problems has been found to be successful in many arenas. Democracy is a bottom-up approach to government where individuals vote directly on what they believe are the best ideas, as opposed to a dictatorship where a single person decides what is best. Capitalism is a bottom-up approach to the economy where individuals vote with their money on which goods and services are the best, as opposed to socialism where government decides which goods and services should be produced. Life itself is a bottom-up approach, in the sense that evolution relies on changes in DNA resulting in different solutions, of which only the best are chosen by a process of natural selection. Crowd-sourcing, genetic algorithms based on evolutionary processes, and many other examples show that a bottom-up approach can solve many problems better and faster than a top-down approach.

But the bottom-up approach is not perfect. People choose poor leaders and buy crappy products. Some developers’ contributions to an open-source software project may be less than ideal. Most larger open-source products are overseen by a foundation whose purpose is to provide some top-down vision and guidance to the development and make decisions about which code modifications should be incorporated into the final product. Foundations also provide a vehicle to accept donations from individual users who just want to give something back to improve the software that they use. Foundations can also take larger donations directly from interested parties for targeted purposes such as adding a needed feature or fixing a problematic bug. A company that needs a feature not available in their current database software may have a long wait if they submit a feature request to Oracle, but if they need it badly enough they can probably pay for it and get it developed very quickly with an open-source database, either by the database’s foundation or by a third-party programmer. This combination of bottom-up development based on user needs with a little bit of top-down guidance and control seems to me to be an ideal approach to solving complex software development problems.

Open-source is often standards based

Because open-source projects have limited resources, they are less likely to try reinventing the wheel and more likely to utilize existing libraries, many of which are themselves open-source products. There are also organizations that develop standards that many open-source projects utilize. In the geospatial world, for example, we have the Open Geospatial Consortium, which has developed numerous standards related to geospatial data and analysis. Their Simple Features Specification for SQL (SFS), for example, has been implemented in the GEOS C++ library and is used by many open-source GIS products and a few commercial products as well.

Logos of Open-source GIS projects

But ESRI utilizes a slightly different vector geometry model and its own proprietary algorithms.  Many people who come from an ESRI-centric GIS education tend to think that ESRI is the standard because it is the most popular commercial GIS software, but in fact the opposite is true. Most open source GIS software will yield the same results because they implement the same SFS geometry model and the same GEOS vector functions. But ESRI software may not, because it uses its own proprietary geometry model and its own proprietary spatial functions. I don’t say that as a criticism of ESRI. In their defense, they developed their proprietary algorithms long before other standard libraries were available. Rather I say it to ease your mind about using open-source GIS software. You can trust it because it uses standard, well understood algorithms and libraries that have stood the test of time.
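
To make the idea of a shared geometry model a little more concrete, here is a minimal sketch in plain Python. It is not GEOS and it is not how any particular GIS package is implemented. It only assumes the well-known text (WKT) encoding that the Simple Features standard defines, reads a single-ring polygon from that text, and computes its planar area with the textbook shoelace formula. The function names `parse_wkt_polygon` and `ring_area` are made up for this illustration.

```python
import re


def parse_wkt_polygon(wkt):
    """Return the ring of a single-ring WKT POLYGON as a list of (x, y) tuples."""
    match = re.fullmatch(r"\s*POLYGON\s*\(\(([^()]*)\)\)\s*", wkt, flags=re.IGNORECASE)
    if match is None:
        raise ValueError("only single-ring POLYGON text is handled in this sketch")
    ring = []
    for pair in match.group(1).split(","):
        x, y = pair.split()
        ring.append((float(x), float(y)))
    # Simple Features rings are closed: the first and last points coincide.
    if len(ring) < 4 or ring[0] != ring[-1]:
        raise ValueError("a polygon ring must be closed and have at least four points")
    return ring


def ring_area(ring):
    """Planar area of a closed ring using the shoelace formula."""
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


rectangle = parse_wkt_polygon("POLYGON ((0 0, 4 0, 4 3, 0 3, 0 0))")
print(ring_area(rectangle))  # a 4 x 3 rectangle
```

Because the text encoding and the closed-ring rule are defined by the standard rather than by any one vendor, two independent programs that read the same WKT start from the same geometry.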

Other examples of standard libraries implemented in many open-source GIS projects include PROJ.4 for defining and transforming coordinate reference systems, GDAL for loading and analysis of raster formats, and OGR for loading vector data from a wide variety of formats. In addition to using existing standard libraries where possible, some open-source GIS software, such as QGIS, acts as an interface to other open-source GIS software such as GRASS and SAGA and can run their algorithms natively, which provides a tremendous amount of power. Spatial data can be complex, and using these standard libraries allows open-source projects to leverage existing knowledge and focus their efforts on areas where standard solutions don’t exist.
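
For readers who have never looked inside a coordinate transformation, the sketch below shows the kind of arithmetic a library like PROJ.4 takes care of. It is a hand-rolled example, not PROJ.4's API, and it covers only one simple case: converting WGS84 longitude and latitude in degrees to spherical Web Mercator metres. The function name `lonlat_to_web_mercator` and the sample coordinates are assumptions made for illustration. A real project should use the library, which also handles datums, ellipsoids, and thousands of other reference systems.

```python
import math

# WGS84 semi-major axis in metres, used as the sphere radius by Web Mercator.
EARTH_RADIUS_M = 6378137.0


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to spherical Web Mercator (x, y) in metres."""
    if not -180.0 <= lon_deg <= 180.0:
        raise ValueError("longitude must be between -180 and 180 degrees")
    if not -90.0 < lat_deg < 90.0:
        raise ValueError("latitude must be strictly between -90 and 90 degrees")
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4.0 + math.radians(lat_deg) / 2.0))
    return x, y


# An arbitrary illustrative point in West Africa (longitude, latitude).
x, y = lonlat_to_web_mercator(-13.7, 9.5)
print(round(x, 1), round(y, 1))
```

Even this single projection needs guards for the poles, where the formula blows up, which is a small taste of why it pays to rely on a shared, well-tested library rather than rewriting the math in every project.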

Profit from open-source?

It is possible for companies to generate profit with open-source software. PHP, for example, is free to use, and for-profit companies have built businesses around it by selling integrated development environments, training, certification programs, and other customized solutions. Some commercial geospatial companies, such as Boundless, provide commercial tech support agreements, training, extensions, and custom development for open-source GIS software.

Perhaps more important, many companies can increase their profit margins by leveraging open-source solutions. This is especially true of projects that require multi-user editing and web-based interfaces, as that is where the costs of licensing commercial products begin to climb steeply.

My journey to open-source GIS

As I said, I am not anti-ESRI. I think they are a good company. I used their products for most of my career and I still do occasionally. I was even fired from a job once, which I had held for 7 years, because I refused to pirate their software. I was very comfortable in the ESRI environment and had no desire to learn something new. I also had thousands of lines of ArcObjects code libraries that allowed me to develop custom solutions efficiently and I had no desire to rewrite them in a new API. I already owned an ArcGIS license, so I wasn’t going to save that expense. But gradually things began to change for me.

I began working for a new company that had some large, complex, and rapidly growing projects requiring multi-user editing. They had been struggling to keep their GIS data organized and integrated with a large amount of non-spatial data related to project scheduling and repeated surveys of existing environmental constraints. They brought me in because of my programming abilities to help streamline their process and avoid some costly mistakes that had occurred because they had so many errors in their data. I was able to accomplish a lot and they were happy with my work, but I was not able to get them multi-user editing. To do that right would have cost in the neighborhood of $70,000 for ArcServer and to upgrade existing ArcGIS licenses to Standard so they could edit data in an enterprise geodatabase. I managed to come up with a work-around that was a bit clunky, and nobody really liked it, but my boss simply wasn’t going to spend that kind of money and it was the best that I could do.

Later on I was tasked with developing a mobile data collection system for a project that involved about 50 field personnel. Again, I would have been happy to utilize an ESRI-based system, but I needed more flexibility than was available with Collector, and I was not impressed with how often it failed in off-line mode, frequently resulting in a loss of data, or at least a time-consuming recovery process. This was several years ago and perhaps it is working better now. I hope so. At the time, however, it just didn’t seem to be a viable solution, especially considering the cost of those 50 annual ArcGIS Online subscriptions, which my boss, again, would not pay for.

So I was stuck again between ESRI telling me that their solutions would cost tens of thousands of dollars, my boss refusing to pay, and my project managers demanding some type of solution. It was frustrating for me, but I understood all their positions, and I saw an opportunity for myself. Work had been slow for a while and I had begun teaching myself to develop web-based GIS applications using open-source software. I had not progressed much beyond client-side applications with Leaflet, but I had begun to look into some server-side solutions that would have allowed web-based editing. Although I had not yet done it, I thought I understood enough to jump in. I told my boss that if he was willing to give me some time to get past the initial learning curve, I could come up with an open-source approach that would not have any continuing costs or license fees and would have nothing to do with ESRI, and that it would be able to do everything my project manager wanted that was not possible with Collector.

He agreed to give me a shot, and I jumped in with my sleeves rolled up. I installed PostGIS as a back-end database, began learning a bit of SQL and PHP, and developed a mobile data collection system based on a web app deployed on Android tablets. I also developed a desktop dashboard that the project managers could use to keep track of the 50 field techs in real time, QC the data, and even exchange messages with individual techs. They were thrilled that I gave them everything they asked for, my boss was thrilled that other than my time, it only cost $13/month for web hosting, and I was thrilled to have learned a valuable new skill. It was win-win-win all around.
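
For anyone curious what the database side of a system like this can look like, here is a rough sketch. It is not the code from that project, which was written in PHP. The table name `field_observations`, its columns, and the function `build_observation_insert` are all hypothetical. The only real pieces are the PostGIS functions `ST_MakePoint` and `ST_SetSRID`, and the `%s` placeholder style used by common Python PostgreSQL drivers such as psycopg2. The sketch only builds the statement and its parameters and does not connect to a database.

```python
def build_observation_insert(tech_id, lon, lat, note):
    """Return (sql, params) for a parameterized insert of one GPS observation.

    The table and column names are hypothetical. Values are passed as
    parameters rather than pasted into the SQL string, which avoids
    SQL injection from free-text notes typed in the field.
    """
    if not -180.0 <= lon <= 180.0:
        raise ValueError("longitude must be between -180 and 180 degrees")
    if not -90.0 <= lat <= 90.0:
        raise ValueError("latitude must be between -90 and 90 degrees")
    sql = (
        "INSERT INTO field_observations (tech_id, note, geom) "
        # ST_MakePoint takes x (longitude) first, then y (latitude);
        # ST_SetSRID tags the point as WGS84 (EPSG:4326).
        "VALUES (%s, %s, ST_SetSRID(ST_MakePoint(%s, %s), 4326))"
    )
    params = (tech_id, note, lon, lat)
    return sql, params


sql, params = build_observation_insert(7, -13.7, 9.5, "soil pit 12")
print(sql)
print(params)
```

With a driver such as psycopg2, the pair would typically be handed to `cursor.execute(sql, params)`. Once the point is stored as a geometry, desktop tools like QGIS and web maps can read the same table directly.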

During that process I also learned to use QGIS as a front-end to manage data in the PostGIS database and I was quite impressed with its capabilities. First of all, I was impressed with its ability to perform multi-user editing of PostGIS data. This would have saved me many nights’ sleep on the previous project. Second, after an initial learning curve, I just found many of its workflows to be more intuitive than ArcGIS, especially editing. Third, its ability to work with such a wide variety of data types not available in ArcGIS without expensive extensions came in very handy for working with web-based applications.

At first, I found myself going back to the comfort of ArcGIS fairly often, but as QGIS became more familiar to me I found this occurring less and less, and today sometimes months go by that I don’t open ArcGIS. I wouldn’t want to be without both, but if I had to give up one it would be ArcGIS, hands down. I am sure that if there were some way to count up strictly the number of features that ArcGIS has, it would be greater than QGIS. But I find that the features that QGIS has that ArcGIS doesn’t are features that I really need in my day-to-day work, and the features that ArcGIS has that QGIS doesn’t are things that I can do without. Other people may have different experiences, but if it works for you as well as it worked for my needs then I see no reason not to use it.

As the GIS manager of small to mid-sized environmental consulting firms I spent a fair amount of time managing licenses and even more time trying to manage people in such a way as to avoid having to ask my boss to shell out more money for additional ArcGIS licenses. Employees came and went, computers failed or became obsolete, sometimes field projects would require a laptop with ArcGIS and when the project was over I would need to put the license on a different computer for office use. It was a royal headache. And then a couple times a year I had to go and ask my boss to pay maintenance on software that he had already purchased and felt he “owned” which would infuriate him. I fully understand ESRI’s right and need to protect its software but man, dealing with licensing issues was the absolute worst part of my job. What a joy it is to just be able to tell people “Go to qgis.org, download the software and install it”.

There are a lot of use cases where QGIS just seems to be a better fit. Last summer I spent a month in Guinea, West Africa, teaching a course on soil mapping with GIS and GPS. I asked them if they had ArcGIS available, or if I should teach the class with QGIS. I was told that they didn’t have ArcGIS available but that they probably could get it, and they asked me how much it would cost. I almost felt guilty telling them that an ArcGIS license cost more than the median yearly income in Guinea. But when I told them QGIS was free I heard a huge sigh of relief. I did tell them that ESRI was a great company and that they probably would provide licenses for free, or at least very cheaply, if we applied for it, and that if they wanted to learn the industry standard I would look into it. But I also had to tell them that they could do everything they needed to do in QGIS for free.

I made my own sigh of relief when they unhesitatingly chose QGIS. The last thing I wanted to do was fill out paperwork asking ESRI for free copies of their software, deal with installing ArcGIS on a room full of computers in an environment with very limited internet availability, explain how to keep their licenses current, and honestly, teach topological editing in ArcGIS which I find much easier in QGIS. The students were also thrilled that they could load QGIS on their own computers for free and practice at home (when they had electricity).

To me, being a GIS professional does not mean being a pro at ArcGIS and working within the ESRI ecosystem. It means being able to find solutions to geospatial problems using the best tool available. For some people that may be ArcGIS and that’s great. In my work I have often found that it is a combination of QGIS and PostGIS. Other people may have a good use case for preferring GRASS, SAGA, OpenJump, gvSIG, or some other open-source software. There are also other commercial GIS products in the market. I have heard good things about Manifold and Radian Studio and their ability to utilize additional GPUs to accelerate large spatial operations. I hope to be able to explore those more in the future.

I wish ESRI the best of luck, I really do. As an industry we need them to be successful. I plan to keep up the maintenance on my license, because QGIS is not perfect either and I want to keep my options open. But having been tossed into the open-source pool against my will, I actually found the water to be quite comfortable and I see no reason to go back to the world of licensing headaches and having to tell my boss he needs to cough up thousands of dollars to accomplish something that seems relatively simple.  I’m not saying that your experience will be the same but I do think that if you really want to be a GIS professional you should be aware of the other options that are available, especially if you can explore them for free.

Fortunately, there is no better time to learn about QGIS than right now in my opinion. QGIS 3.0 has not been officially released yet but should be very soon. You can download the release candidate (called 2.99) from their web site. I’ve been using it for several months now and I think it’s a big improvement over 2.xx. If you are a programmer, there have been some significant changes to the Python API and so if you learn PyQGIS3 now you know it will probably be many years before there are significant changes to the API and your code will be viable for a long time.