Tangling with Open-Source

Frustration is necessary.

Right now, I’m breathing a sigh of relief. I’m letting many weeks of pent-up frustration slip away. And at the same time, I’m realizing how glad I am to have encountered the uniquely enlightening source of that frustration.

I recently concluded work on a project centered around virtual machines. It was an academic setting, so we were limited to a non-proprietary solution, i.e., we were entirely reliant on open-source software. It didn’t take long for me to love, hate, and more importantly, to give true thought to open-source for the first time in my life.

I came to realize how impressive it was that a team of volunteers could create something so complex, so complete and functional, and not put a price tag on it. I also came to join the ongoing queue of people complaining about the expected but nonetheless despised flaws and bugs in those same programs.

Of course, most people, myself included, have used programs and tools from the open-source domain before, whether intentionally or not. Plenty of software is released under the GNU GPL or similar licenses, and if you go searching online for “free ____ program”, chances are your final download is either open-source software or a virus.

Obvious virus is obvious.

But somehow this was different. This wasn’t CamStudio or PeaZip or any other impressive but limited-use application. The more specific your goals and the more esoteric your task, apparently, the more likely you are to encounter software that is less robust, appealing, and full-featured than its proprietary counterparts. Case in point: Hadoop and Eucalyptus, two tools which I met, used, and was both delighted and frustrated by. Both are acknowledged as quality software, but both are fragile when it comes to variations in OS, system architecture, etc. And both tend to output enigmatic errors.

“All datanodes are bad.”

“Unable to create new native thread.”

“Too many open files.”

“Connection refused.”

Sources like Stack Overflow had solutions to only some of these. Even when known solutions existed, they often needed major adaptations, or simply failed to work because of some quirk of my systems compared to the standard setup, and often not an obvious one. And I, as a newcomer, spent hours struggling to make my open-source tools accomplish tasks that would likely have taken an accomplished hand mere minutes. If this had been proprietary software, I could have appealed to the often verbose manuals, guides, etc., or sent an email looking for the specific solution to a known bug. Thus the cost of freedom here becomes time and frustration. Worth it? I suppose that depends on perspective.

As with any solution, TANSTAAFL. There are advantages and disadvantages. In my situation, open-source was not only helpful but necessary (though if you had caught me three hours into trying to resolve an obscure Hadoop file system issue, I would likely have disagreed vehemently). In more highly time-bound situations, reliable, robust tools with a support center are obviously the better choice. Is the split a clear one between academic settings and industry? Not necessarily.

The choice is always tricky, and as with many things, you may not know which option is best until it’s too late. But there are guidelines, like cost vs. reliability and simplicity vs. support, and the case can be made both ways.

In any case, I’m glad to have experienced the joys and frustrations of using open-source software for a worthwhile and not insignificant project. Experience with both sides of an issue can only be of help in future projects and plans. And at least from now on I am lucky enough to join in on bellyaching about the major headache of open-source software (when we’re not too busy praising it for all the ways it’s helped us).

— M(C)B
