28 July 2015

Hands-on Review: Xinuos OpenServer X

If the title baffled you a bit, then you're not alone.  Here's the backstory:

  • A company called UnXis was founded in 2009 to sell (you guessed it!) UNIX operating system software.
  • UnXis was built on the assets of the now long-defunct and bankrupt SCO Group, including the variously XENIX- or System V-based UnixWare and OpenServer products, but appears to have since severed all ties with the SCO Group, and has also made a very public and pointed disavowal of patent trolling.  Smart move, methinks.
  • In 2013, UnXis dropped that awful camel-case name in favor of the much more straightforward Xinuos.  It took me a few minutes to figure out that the company has nothing to do with Xinu, a fact which left me slightly bitter.
Are you still with me?  Good.  Because it's about to get even more bizarre.

Xinuos OpenServer X has absolutely no connection "under the hood," other than in the most ancient and pedagogical senses, to the older UnixWare or SCO OpenServer lineages: those were, as mentioned previously, derived from XENIX or System V UNIX, whereas OpenServer X is based directly off of FreeBSD.

Furthermore, I couldn't help but note my own tendency to abbreviate the operating system's name, in my own mind, as "OS X."  I wonder how a certain fruity company feels about that, if indeed said company is even aware...

Now, let's move on to the system itself.  I'm going to be very candid here: it is, arguably, stock FreeBSD with a bit of "shellac."  Xinuos has simply replaced the FreeBSD name and brandmarks in various places with OpenServer, replaced the Beastie with a large two-tone "X," and added a few binary packages and some GUI sugar by default.  The installation DVD and USB images first launch into a bootloader virtually identical to that of "stock" FreeBSD, present one with similar options (adding, notably, the choice of whether or not to boot into a GUI), then proceed to jump into the FreeBSD kernel.

The usual FreeBSD kernel messages whiz by your screen.  You get a warning if you have less than 8GB of RAM installed (8GB?  Really?) and are then dropped, if you allowed the default boot to proceed, into an XFCE4 desktop environment.

The desktop environment is very well-themed, I'll admit.  Everything melds together nicely, and it certainly looks quite professional.  There's an icon for administration tools (I'll come back to that in a minute) and another for an installer, in addition to the usual file browsers and help documents.

The installer is an odd creature.  It mixes, in a rather haphazard fashion, the text and buttons almost verbatim from the default console-based FreeBSD installer into GTK+ windows, with the occasional handoff of the process out to a terminal window.  My guess is that it's running the FreeBSD installer in the background, bringing some of the simpler dialogs into GTK+ views, and just showing the otherwise-hidden installer for more complex interaction.  The installer proceeds, gives you the usual partitioning, service configuration, networking, and user setup options, and then offers to reboot for you.  How thoughtful!

If you're not impressed so far, there is one redeeming bit coming up.

Post installation, the system boots by default into the SLiM login manager (again with a custom theme, of course), then launches XFCE4 again.  At your new desktop, still well-themed, you have the same icons as the installation desktop, minus the installer itself of course.

At this point, I decided to try my hand at the administrative console, and I was in for my first pleasant surprise of this entire experience.  The web-based admin console is an absolute delight: it's well-organized, completely functional, and comprehensive.  My immediate reaction, of course, was "who else's work is this?"  My skepticism proved to be well-founded: poking through the nginx configuration files and the web root shows that the web UI is the FreeNAS web UI with some custom branding.

So, what's my overall impression of this "new operating system"?  It makes an excellent, well-rounded, intuitive FreeBSD distro.  The OpenServer X team has put considerable effort into neatly integrating and streamlining various other tools.  Having said that, it still has some rough edges (particularly the installer jumping between GTK+ dialogs and a terminal), and I also take umbrage at Xinuos branding it as a custom-built operating system when a far more honest appraisal would be a FreeBSD "convenience" distribution.

I'd recommend this for anybody who A) wants a FreeBSD server without a lot of the headache traditionally involved in setting one up, and/or B) feels that PC-BSD is a bit bloated.  I do not feel that Xinuos should charge for the product itself, which they are wisely not doing at the moment.  They could, however, quite justly follow the standard community-driven model and charge for support, and I'll call them a decent FOSS contributor and move on with my day.

11 June 2015

Hands-on Review: Cowboy Web Server

Web servers are a dime-a-dozen these days.  I just barely remember the days when developers had to choose between standalone web servers and those tightly integrated with or into their applications.  That distinction is almost non-existent at this point, and most modern web servers can be configured to use either approach, based on a project's requirements.

There is, however, still a spectrum of sorts with regard to APIs and languages.  The more general web servers (Apache HTTP Server, nginx, et al.) are intended to be language-, platform-, and API-neutral, while most of the "Web 2.0 and later" languages have at least one web server that can be integrated directly into applications.  Ruby, in particular, has more options in this "integrated server" department than you can shake a stick at; in fact, some might say that it has too many options, and that, like the "Cool Kids" in Garrett's presentation, they tend to be a bit too focused on buzzwords at the cost of substance... but I digress.

Erlang, for all practical purposes, has a total of five options- three in the "old guard":
  • inets, a package included in Erlang's basic "OTP" distribution
    • Easy to work with, well-documented, but lacking some out-of-the-box features
  • Yaws "Yet Another Web Server," is a massively-concurrent juggernaut
    • Back-end code is written in a structure that should be familiar to PHP devs.  It pretty consistently out-performs Apache with large numbers of simultaneous connections.
  • Mochiweb is a lightweight web server toolkit
    • Not really a complete web server, but a useful set of routines that's a bit more complete than the stuff in inets.
...and two* in the "newcomers club":
  • Elli, which is intended to take the best aspects of the old guard and bring them together
  • and, finally, Cowboy, which takes a few radical departures from the traditional models
* a third newcomer, Misultin, recently announced an end of active development

All of these server applications and libraries have their own merits, but Cowboy has interested me in particular because it takes some radically different approaches to the web server model.  Your application is in control of the web server, not the other way around, and you can do anything from direct socket-based interaction all the way to HTML copy-pasta, as suits your needs.

From some extensive poking and prodding, Cowboy appears to have three "layers," two of which are mandatory.  They are, from lowest-level to highest-level:
  • The "listener" layer, responsible for accepting connections on TCP sockets, doing some initial low-level decision making, and spawning a new higher-level process (per connection) to take things from there
  • An optional middleware layer, which provides a nice modular way for developers to avoid re-inventing the wheel.  You can just drop one of many middleware modules in here to handle all sorts of "boilerplate" application functionality, like authentication and session handling, security auditing, etc.  These middleware layers do their magic dances with the web clients, and expose the relevant information to the next layer through small, unintrusive function calls.
  • The "handler" layer, which is where your (yes, I'm looking at you, dear reader) application lives - this layer can deal in anything from the aforementioned raw socket interaction to plain HTTP, depending on how you configured the corresponding listener entry.  All of the information from both the listener and the middleware is available here, too, so you can make intelligent decisions about content delivery.
That modules-in-layers paradigm leads to some nifty things.  The intelligent content delivery that I just mentioned is a serious boon compared to other web servers that I've used.  Getting information on the logged-in user, for instance, can be as easy as a single function call, and you don't have to worry about actually making the authentication happen if you don't want to (provided that there's a middleware plugin available for your auth system, which there probably is).  In another example, based on whether or not a connection is encrypted, you can deliver not just different pages, but different text strings or backend logic within the same "page."  It's just a simple { Tuple } = function( ... ) pattern match operation in your application code, and you can then apply any conditionals or guards that you like.  The same goes for user authentication state (if you've told the optional middleware to let users get to you at all without auth, that is), web client info, HTTP headers, etc.  All of the information is just there, waiting for you to act on it.
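To make that concrete, here's a minimal sketch of a listener-plus-handler pair against the Cowboy 1.x API as I understand it; the module name, listener name, port, and route below are all my own arbitrary choices, not anything Cowboy dictates:

%% hello_handler.erl - a minimal Cowboy 1.x sketch (names are arbitrary)
-module(hello_handler).
-behaviour(cowboy_http_handler).

-export([start/0, init/3, handle/2, terminate/3]).

%% Listener layer: compile a routing table and start accepting HTTP on port 8080.
start() ->
    {ok, _} = application:ensure_all_started(cowboy),
    Dispatch = cowboy_router:compile([
        {'_', [{"/", ?MODULE, []}]}
    ]),
    {ok, _} = cowboy:start_http(hello_listener, 100,
        [{port, 8080}],
        [{env, [{dispatch, Dispatch}]}]).

%% Handler layer: everything the lower layers learned about the connection
%% arrives wrapped up in Req.
init(_Transport, Req, _Opts) ->
    {ok, Req, no_state}.

handle(Req, State) ->
    %% A single pattern-matched call pulls out per-connection information.
    {Peer, Req2} = cowboy_req:peer(Req),
    {ok, Req3} = cowboy_req:reply(200,
        [{<<"content-type">>, <<"text/plain">>}],
        io_lib:format("hello, ~p~n", [Peer]),
        Req2),
    {ok, Req3, State}.

terminate(_Reason, _Req, _State) ->
    ok.

Swap cowboy_http_handler for the loop or websocket handler behaviours and the same overall structure holds; the listener and any middleware you've configured stay exactly as they are.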

The security-conscious will no doubt be alarmed at this point.  "Wait," you might say, "how do we ensure that an attacker can't get to all of that information?"  Though there's no such thing as a fool-proof application, of course, Erlang [unintentionally] mitigates much of this risk for you.  Since it's a compiled language, with even a modicum of care you don't have to worry about code injection into the backend.  What's more, due to its functional paradigm there isn't really any "global namespace," so, short of major bugs in the Erlang runtime (which has had nearly thirty years to stabilize), each particular function in your application only has access to the data that you explicitly hand it.

As if all of that is not enough to like, Cowboy is fast.  I'm talking 'cheetah after ten cups of coffee' fast, here.  The usage of a compiled language (particularly if you use BEAM's native code compilation), plus the tight integration between web application and server, leads to some awe-inspiring response times.

Now, lest I start to sound like a used car salesman, there is a downside of all of this awe-inspiring 21st-century goodness: for those of us used to more traditional web servers, the learning curve here is steep.  It's not just the fact that you pretty much have to learn at least a small amount of Erlang, either.  Imagine, for example, if you had to actually re-compile your Apache httpd.conf after making a configuration change!  Now, rebuilding Cowboy's listener for your application takes less than a second on even a rather low-horsepower system, but it's just something that the uninitiated would never even think to do.  There are plenty of other "gotchas" along those lines.  The biggest challenge of using Cowboy is then, in my own opinion, the hours of repeated head-desking and hair-pulling required to learn how to use the tool.  Once you've figured it out, you're walking on sunshine, of course, but the road to get there is dark and terrifying... and that, too, fits right in with the Erlang ecosystem.

26 May 2015

Hands-on Review: Fedora 22 Workstation

Fedora 22 was officially released today.  Under normal circumstances, I'm all about the cutting edge, and would likely have reviewed it while it was still in beta some months ago.  In this case, the most I could manage was two days early.  At this point, though, I've had a solid three days with it, and have poked and prodded it extensively.

So, right out of the gate, I'll admit it: I did a yum-based upgrade.  The Fedora Project does not recommend this method, so unless you're really comfortable with the internals of the distribution, you should probably stick with the official FedUp tool for upgrades.  Having said that, I do have quite a bit of faith in Btrfs snapshots and my own ability to pass "rootflags=subvol=fc21_working" to the kernel during boot.  In any event, the process worked fine, and the recovery tricks weren't necessary.
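For anybody curious (and comfortable with the risk), the whole thing boiled down to something like the following.  I'm reconstructing it from memory, and the snapshot step in particular depends on how your Btrfs subvolumes are laid out, so treat it as a sketch rather than a recipe:

# take a safety-net snapshot of the working root first
# (fc21_working is the subvolume name referenced above)
sudo btrfs subvolume snapshot / /fc21_working

# then point yum at the new release and let it pull everything across
sudo yum clean all
sudo yum --releasever=22 distro-sync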

There are several changes in the workstation version, and even more in the Cloud and Server versions for that matter.  The following, however, leaped out at me immediately (on a ThinkPad X1 Carbon gen 2, for those interested):

  • Kernel 4.0.4
    • This change isn't as intimidating as it seems.  Mostly, this is just an incremental change, and could just as easily have been called version 3.20.
    • The biggest change here is that the code necessary to support reboot-free kernel patches (available for some time with third-party tools) is now available in the mainline kernel.  The functionality isn't readily exposed in workstation versions, so don't expect it to Just Work™, but anybody feeling brave can try it out with kGraft or kpatch.
    • Yes, for those of you who don't follow the HA GNU/Linux server world- you read that correctly.  Linux has been able to do reboot-free kernel patches for some time with third-party tools.  Kernel 4 just brings that into the standard kernel tree.  Bet you don't hear MS or Apple talking about that one in their product demos, eh?
  • Faster startup
    • My boot, from power-on to GDM login dialog, was already about five seconds (not including GRUB menu timeout) with Fedora 21, courtesy of a good SSD.  Fedora 22 shaved about one second off of that.
  • Yum is deprecated, long live DNF!
    • Yes, you read that one correctly, too.  Good ol' yum is still around if you really want it, but it's been officially deprecated in favor of a new tool, dnf.  DNF has virtually complete syntax compatibility with Yum, but has noticeably more zip-zoom, particularly for complex dependency resolution (a few equivalent commands are sketched just after this list).  There's even a new version of yumex (which retains the name, oddly enough) particularly for DNF.
  • Using libinput by default, instead of discrete X.org drivers
    • Many users probably wouldn't notice this at all.  As I have a Synaptics trackpad, however, the initial trackpad behavior was a bit odd.  The tracking and acceleration were different, but that was a quick adjustment.  My two-finger right-click was also gone (not sure if that's libinput or Gnome 3.16), but it was quickly restored with dconf editor.
  • Notifications are much nicer
    • This is probably more Gnome 3.16 than anything Fedora-specific, but it's nice.  Any notifications now show up briefly under the clock, and I can dismiss them completely, or they'll vanish on their own, and clicking on the clock shows a list of any notifications that haven't been dismissed.
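To give a sense of just how little re-learning is involved, here are the sorts of commands I found myself running; the package name is just an example:

sudo dnf check-update        # same verb, same sort of output as yum
sudo dnf install htop        # installs, with noticeably snappier depsolving
sudo dnf upgrade             # the equivalent of 'yum update'
dnf info htop                # metadata queries work the same way, too

The syntax carry-over really is that close; the speed difference is where you notice the change.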
So, the summary: going from Fedora 21 to Fedora 22 was a nice series of small improvements that, together, add up to a rather pleasant experience.  The upgrade was reasonably painless, and so far all looks well.  My next projects will be to mess around with the integrated Docker functionality, and to get reboot-free kernel patches working with kpatch.

Also worth noting: I was a devoted openSUSE user for several years, mostly courtesy of Yast.  Fedora's system-config-* tools, while not quite a drop-in replacement, work well enough that, in tandem with the more widespread software support, I think Fedora is now my distro of choice.

28 March 2015

The Rise of Elitism in FOSS


WARNING: This post consists primarily of opinion, rather than fact.  You might even quite fairly call it a 'rant.'  Please don't take anything here as a challenge to your identity or beliefs, though I do hope that you find something useful in it.

It's a simple statistic: the Linux kernel is currently, by a large margin, the most common foundation for FOSS operating system distributions.  That kind of unification is something that I could scarcely have imagined fifteen years ago, and I'm completely in favor of it.

There is, however, a darker side to the success of the various GNU/Linux distributions (to say nothing of Android): quite a few software authors and hardware vendors have fallen under the enchantment that supporting the Linux kernel and the GNU userland, to the exclusion of other platforms, constitutes fully embracing open-source software.  I'd like to assert the potentially dangerous opinion- and I'm considering buying a set of full-body armor for this- that it doesn't.  In fact, I'm going to go so far as to suggest that supporting only a single operating system, even one both as open and as popular as GNU/Linux, to the exclusion of others is antithetical to the spirit (though perhaps not the definition) of "free and open-source software."

How can I say that, when something that is both... well... free and open source... is the very definition of the term?  Because, in my twenty years of dealing with FOSS, there has always been an implicit understanding that "free and open-source" also generally implied "cross-platform."  That was certainly not ubiquitous: there were always a few FOSS applications that were just for Windows, or just for Mac OS (X and 'classic').  The bulk of FOSS software, and reliably the most popular tools, were generally made cross-platform.  They might start out as Linux-only, or Windows-only, or Mac OS-only, but were ported within short order.

The graphical terminal client PuTTY is a great example of the above model: it started out as a Windows-only program, and it filled a niche there which was otherwise unoccupied (i.e. free graphical SSH client and related tools).  Later, as Linux gained more mainstream popularity, many users coming from Windows were intimidated by the different-but-perfectly-capable OpenSSH clients in the terminal, and wanted their familiar PuTTY GUI client.  So the author of the original piece got some help, put commendable work into it, et voila!  There's now a Linux port, and there are rumors of official ports underway to both Mac OS X and 'classic' (yes, you read that last one correctly).

At the risk of sounding like the quintessential old man shouting at the kids to get off of his lawn, that's how FOSS should be.  The original developer(s) may or may not have time and skill to port their software to other platforms based on demand, but if others are willing to take up the mantle and do the work, more power to them!

That's not what I'm seeing in many open-source projects right now.


I've recently encountered, for the first time in my twenty years of dealing with FOSS, an aversion to and scorn of porting software to other platforms.  What's worse is that it seems to be contagious.  The most nefarious villain in the room right now is the Direct Rendering Manager.  For the uninitiated, it's a graphics stack component, originally written for Linux, that allows direct communication between graphical client software and the supporting graphics hardware, making e.g. accelerated 3D graphics possible, or at least less painful, on such systems than the traditional X11 client / server model would otherwise allow.

Reading the freedesktop.org page linked above, and the corresponding Wikipedia entry, one might be led to believe that DRM is only for Linux.  Neither page has any mention whatsoever of DRM being used, available, or supported on other platforms.  Unfortunately, nothing could be farther from the truth.  I use 'unfortunately' here in a very encompassing sense- both the omission itself and the usage on other platforms are unfortunate.

DRM is used on Linux, all of the modern BSDs (FreeBSD, NetBSD, OpenBSD, DragonflyBSD), Solaris (since the OpenSolaris days), and probably quite a few other lesser-known FOSS operating systems besides, and probably even some otherwise closed-source 'Unix-like' systems as well.

Good luck telling anything in the above paragraph to the "official" developers; they don't want to hear it.


The only "official" DRM is the Linux DRM.  Other operating systems might call their equivalent projects DRM, and they might be derived from the same code base, but oh no!, they're not the DRM.  Developers on other platforms (including myself) often face (at best) disregard or (at worst) scorn, vitriol, annoyance, or outright hostility just contacting the "official" DRM project with questions, trying to understand a particular piece of code or function call, for the sake of porting it.  As a direct result of this hostility and the "our castle" mind set, otherwise-capable developers avoid the project, which is a major contributing factor, though by no means the only one, to accelerated graphics support on those other operating systems often being years behind Linux.

So why use DRM / DRI at all?  One word: X.org!  The debate and dissent over which FOSS X11 distribution would become the de facto standard for FOSS 'Unix-like' operating systems was settled more than a decade ago, for better or worse, and X.org is the only free and open X11 implementation with comprehensive driver support, maintained and updated, and (perhaps most importantly) supported by hardware manufacturers.  If you're not running X.org, good luck getting that latest graphics driver from Intel or NVIDIA to work!

Oh, and by-the-by, if you're on a FOSS operating system and you have anything even remotely current from Intel or AMD in terms of graphics hardware, you'd better have DRM and KMS or you're up that proverbial and 'fragrant' brown creek, and without any of the proverbial paddles.

In short, in terms of porting, the whole kit comes together, or not at all: X.org, its drivers, current DRI/DRM, and KMS.

So, do you see why having half of the kit (the X.org part) more than willing and happy to be cross-platform, while the other part (DRI / DRM) resents it, presents a problem?  I thought so.

To add insult to injury, the same individuals who so vehemently resent the porting of DRM to other platforms then like to poke fun at those other platforms for lagging behind.  The situation rather reminds me of some sick, geekish pantomime of the teenage bully who uses his victim's hand as a weapon on the victim's own head, all the while chanting, "stop hitting yourself!"

That would be bad enough in an isolated incident, but it's becoming quite common.  Certain authors of software for Linux have begun the duplicitous behavior of advocating the "free" part of FOSS, while simultaneously resenting or scorning the "open" part.  Mercifully, it's still the exception rather than the rule, but I find even a single instance of it to be counterproductive.

How has this happened?


Well, guesses abound, but I think that the answer is simple: it's human nature.  My observation has been that, for quite some time, the advocates of and contributors to FOSS had been part of a niche movement, a minuscule minority, and represented some of the more altruistic (if also quite eccentric) elements of society.  As the FOSS movement has taken off, it seems that more typical human demographics have become common, bringing the typical human baggage with them, including rampant elitism.

This, too, shall pass...


Luckily, I don't think that this is the swan-song of the FOSS movement.  This is purely optimism on my part, and admittedly almost an act of (dare I use the term?) "faith," but I think that the situation will correct itself in time.  My guess is that elitism in FOSS projects will hit a peak in a few years, and that the situation will be rather grim for a while.  Every FOSS operating system but Linux will fall behind for a while, but there will always be people who resist main-stream ideas.

My guess is that Linux will eventually become too main-stream for the most talented developers, and there will be a shift away of the greatest talent back out to other FOSS operating systems for a while, and Linux will hit a period of stagnation and (even more severe) fragmentation.  That will be the "catch-up" period for other FOSS platforms, before the pendulum eventually swings back the other direction.

We can look at history, politics, the economy... in fact, just about every element of human culture, and see oscillations in ideas and values over time.  I suspect that the FOSS movement will prove itself to be no different, and that popularity will vacillate over the course of decades between the more militantly-open systems (GPL licenses, Linux, etc.) and the more permissively-open (MIT / BSD style licenses, the BSDs and OpenSolaris, etc.).

Am I right in those guesses?  Only time will tell.

EDIT 1 (10 December 2015): One prominent example that I suspect is related is Sarah Sharp's "Closing a Door"

22 February 2015

The System Programmer's Creed

10    The Kernel is my supervisor; I shall not enter Ring Zero.

20    It maketh me to write well-formed code; it leadeth me to respect the hardware.

30    It restoreth my stability: it leadeth me down the paths of protected memory for the sake of its drivers.

40    Yea, though I walk through the valley of the shadow of panics, I shall fear no segfault: for the Kernel is with me; its spinlocks and its syscalls comfort me.

50    It prepareth an API before me in the presence of mine kludges: it escalateth my privileges; my nice value is negative twenty.

60    Surely uptime and free memory shall follow me all the days of my career; and I will code in harmony with the Kernel forever.

21 February 2015

The Death of Hardware RAID... Good Riddance!

There was a point in the not-too-distant past when hardware-backed RAID storage was an indispensable part of the toolkit of any server administrator.  The basic idea of RAID (Redundant Array of Inexpensive Disks, for the uninitiated) is to mitigate the risk of data loss problems associated with hard disk failures.  You have a set of disks, i.e. an array, and if one (or sometimes if more than one, depending on the configuration) fails, your data is still available... often in a manner transparent to the operating system and/or end user.

Like everything else in computing, RAID moved in generations, with different configurations in different generations having different capabilities:

  • RAID 1: The earliest arrangements that could be considered RAID were shipped in 1983, though the actual term RAID wouldn't be coined for several years.  These earliest configurations consisted of mirrored pairs of drives, i.e. two drives with the same contents that appeared as a single drive to the system.  For two drives of the same size, the total amount of available storage is that of a single drive.  RAID 1, also known as mirroring, is still in common use today for some scenarios.
  • RAID 2: This early implementation was a highly-performant mode that striped individual bits across drives, and increased the available capacity while still providing error correction via Hamming codes on one or more dedicated drives.
  • RAID 3: A rare implementation using a dedicated parity drive, in which bytes (as opposed to individual bits as in RAID-2) are striped across drives, this level was never deployed widely.
  • RAID 4: Very similar to both RAID-2 and RAID-3, except that entire blocks, instead of bits or bytes, are the unit checksummed and striped across drives.
  • RAID 5: The de facto RAID mode for server admins for many years, RAID 5 is novel in striping both data and parity across all drives, which improves performance and ensures that all drives wear evenly, unlike levels 2 through 4.  A minimum of three drives is required, with the capacity of approximately two drives (more generally, all but one drive's worth) available for data.  RAID 6 is almost identical to RAID 5 save that parity is doubly-redundant, allowing up to two drives to fail without compromising data.
There have also been various stacked / combined levels, one of the most notable of which is referred to as RAID 10, a stripe of mirror pairs.

All of these methods also have some serious shortcomings:
  • While all of these systems address the problem of data availability, only the most advanced (and therefore expensive) implementations give a tinker's d*** about data integrity.  A recent well-written article by Jim Salter, including some handy experiments, documents the problems of RAID and bit rot.  The short version: RAID protects against complete disk failure, not against subtle disk corruption.
  • Hardware RAID is (or was, until very recently) both expensive to implement, requiring dedicated controllers, and tedious to maintain, requiring experienced personnel and complex drivers for every supported platform.  It was formerly a hobby of hardware and software manufacturers to gang up on IT professionals by only allowing certain hardware to work with certain software in certain supported deployments.  Of course, those days are long gone, right? *wink, wink, nudge, nudge*
  • Software RAID, while virtually free to implement and much easier than hardware to maintain, suffers from significant performance problems.  It's also very rarely portable across operating systems.
After all of these years, the relationship between IT folks and RAID has turned sour like a shotgun marriage...

So what's the alternative?  Enter next-generation file systems.  That's right- file systems.  Traditionally, file systems just organized files, and it was up to volume management solutions like RAID to keep the file systems consistent and happy.  As the problems with RAID were exposed, and personal computing devices (where RAID is rarely an option) became more widely used, file systems themselves became more resilient to compensate.

The next generation of file systems is a bit different.  The solutions combine both volume management, typically including RAID-like capability, and file systems into a single all-encompassing data storage mechanism.  This allows them to do all sorts of amazing things around data storage, with tricks to both save space and protect data integrity, all while inducing a negligible performance penalty when properly provisioned.  Examples include:
  • The cross-platform ZFS, originally developed by Sun Microsystems and sent out in an open-source lifeboat before the company sank, was the pioneer of next-generation filesystems.  It supports mirroring, striping, and RAID-Z (like RAID-5), and also supports nesting of those different types.  It also supports block checksums, in-line block-level deduplication, live snapshots and rollback, filesystem clones, transparent compression, transparent encryption in some implementations, differential backup and restore, etc. ad nauseam... Short version: it's awesome, there's a Free & Open Source version available via the OpenZFS project, and it's available on most *nix platforms (there's a brief sketch of the admin workflow just after this list).
  • Linux's Btrfs has a similar feature set, with integrated volume management and RAID-like capabilities, plus snapshots, plus transparent compression (notably with file-level granularity).  Deduplication is done outside of the write path, to conserve memory.  Btrfs also has a special mode of operation for solid-state drives which reduces unnecessary writes, in addition to the TRIM command also supported by some ZFS implementations.  While not as mature as ZFS, Btrfs is quite stable and usable in most environments, and seems to be preferred over the former for Linux usage.
  • A lesser-known option, HAMMER (no connection to the nefarious baddies in the Marvel universe), is available only on DragonflyBSD at the present time.  It has a somewhat limited feature set compared to the previous two options, but still does deduplication, checksums, etc.  RAID-like functionality is not yet implemented directly, but rather through a streaming backup mechanism.
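Before moving on: to make the "volume manager and filesystem in one" point concrete, here's roughly what day one with a ZFS pool looks like.  This is a sketch from memory, and the pool name and device names are made up, but the commands themselves are standard ZFS fare:

# create a mirrored pool from two whole disks - no RAID controller in sight
zpool create tank mirror ada0 ada1

# per-dataset features are just properties
zfs set compression=lz4 tank

# snapshots are instant and practically free
zfs snapshot tank@before-upgrade

# a scrub re-reads every block and verifies it against its checksum,
# which is exactly the bit-rot protection traditional RAID lacks
zpool scrub tank
zpool status -v tank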
These next-generation filesystem-and-volume-manager combos all attempt to address the shortcomings of the traditional filesystem / volume manager model, and do so rather effectively.  But hang on a second, you ask: we've heard about Solaris, the BSDs, and Linux.  What about the two major consumer operating systems: Mac OS X and Windows?

In terms of major consumer operating systems, while Apple's default HFS+ file system is practically a geriatric patient in technological terms (having not changed much structurally since the mid-90s), there is a stable, well-maintained, and free build of OpenZFS available for Mac OS X.  Recently some work has even been done to make it bootable.  It's not officially blessed (pun intended- please tell me that somebody gets it) by Apple, but it does the trick for those who want and/or need it.

Windows users are almost out of luck!  The most capable file system offered by Microsoft at this time is ReFS which, when combined with Windows Storage Spaces, isn't terrible... but it's also nothing amazing either.  Even an article by a Microsoft Kool-Aid drinker fanboy advocate admits (starting at paragraph 16, for those looking for it) that its feature set is unimpressive compared to those other solutions.  The only selling point that he can offer over ZFS is reduced memory footprint... which any server admin knows is practically a farce when talking about Windows- quite possibly the most notorious modern OS when it comes to memory thrashing.  It's also still a discrete filesystem / volume manager solution, with all of the inherent limitations of that design.  Nevertheless, it still has significant advantages over hardware RAID, like the ability to detect bit-flips with opt-in (seriously?  you have to turn it on manually, from a command line?) block-level checksums.

While there are those out there still defending hardware RAID, they must do so with the caveat that it should no longer be considered the panacea as which it was regarded for many years.  If you're a server admin in 2015, and you're still using hardware RAID as your go-to redundancy solution without any hesitation, you should probably take a good, long look in the mirror and reevaluate your life.  Start by asking which of those solutions above could bring amazing data integrity goodness to your microcosm.

09 February 2015

Selecting the Right Virtualization

In the not-too-distant past, talking about a product "ecosystem" around virtualization would have been like talking about snow in the Sahara- it just didn't exist.  In 2015, though, things have changed.  There are so many different types of virtualization that making the right selection for a particular usage need can be a good way to wind up in the mental ward of your local hospital.  Broadly speaking, here are the different types "summed up," though there is much more that could be said:

  • Imitate hardware - a complete "computer within a computer", high overhead
    • Emulation - instruction-for-instruction imitation of CPU, devices, memory, etc.
      • Pros: very accurate, reliable, and secure
      • Cons: very slow
    • Virtualization - take some shortcuts where possible, imitate the rest
      • Pros: reasonably fast
      • Cons: can sometimes be breached, not all hardware can be used
  • Imitate an operating system - just enough isolation to fool programs, low overhead
    • Full environment - "container"
      • Pros: highly flexible and "multi-purpose"
      • Cons: difficult to set up and maintain
    • Single application - "sandbox"
      • Pros: simple maintenance and deployment
      • Cons: keeping many of them organized is a chore in its own right
There are also several technologies that combine both categories, such as QEMU's "user mode" wrappers that provide architecture instruction set translation like emulators, but don't emulate any hardware and so behave like sandboxes.
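As a quick illustration of that hybrid category, QEMU's user-mode binaries let you run a single foreign-architecture executable directly, with no emulated machine around it.  The exact cross-compiler and package names vary by platform, so consider this a sketch:

# build a tiny ARM binary on an x86-64 box...
arm-linux-gnueabi-gcc -static -o hello-arm hello.c

# ...then run it in place: qemu-arm translates the instructions and
# passes system calls straight through to the host kernel
qemu-arm ./hello-arm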

I've tried just about every solution under the sun at this point (pun not intended; if you get the reference then you're my new best friend), and the following flowchart is based on my own experiences for what works and what doesn't in any given situation.  Note that I do not attempt to account for very special-purpose cases like cloud computing infrastructure- these are orders of magnitude greater in complexity, and would require a chart far more complex than what can fit here:


Feel free to use / distribute / wipe your a** with the chart as you choose.  It's also worth noting that two of my personal favorites, FreeBSD jails and Solaris containers, are not listed here.  They require some higher-level specialist and/or institutional knowledge that your typical organization will likely not have on staff.  They're also more powerful than any of the other container / sandbox solutions mentioned above, with the *possible* exception of Parallels Virtuozzo.  Sorry, lxc and Docker- you're just not there quite yet, in my own humble opinion.

08 February 2015

Comparison of OpenSolaris Derivatives

After Oracle effectively neutered the OpenSolaris project, many spin-offs of the public source code popped up, like oh-so-many mushrooms in a fallow field.  Many short-lived derivatives have come and gone, and there have been some surprising changes in the ecosystem.  I decided that it was time for a re-evaluation of the available variants.

It should be noted that the goal here is to compare these various OpenSolaris spin-offs against each other, not against other FOSS operating systems.  As such, I focus primarily on the quantifiable traits that differentiate them.  Things that they all have in common are omitted.  Traits like the software packaging system in use are not quantitative, only qualitative, so that's not of interest here (though it is a serious consideration for usability).  Similarly, capabilities relative to other operating system families like GNU/Linux, the BSDs, or Windows are outside the scope of this comparison.

In considering each variant, I looked at the following attributes:

  • Download ease - could a download be quickly and readily obtained from the variant's main page?
  • Version maintenance - has an official release been made within the last 365 days?
  • Documentation availability - could installation and setup documentation be found easily from the download page?
  • Documentation maintenance - does the available documentation cover up to the most recent official release?
  • Boot capability - does the variant support EFI boot on x86-64 "out of the box"?
  • Disk label recognition - can the variant read GPT disk labels, at least in a non-boot capacity?
  • VirtIO support as a KVM guest - VirtIO block devices, network devices, memory ballooning, CPU hotplugging, and serial devices are all considered
  • Ability to act as a KVM host - pretty self-explanatory, all via the Joyent illumos-kvm project

Each attribute is worth one point- all are boolean, one or zero, with the exception of VirtIO support, where each of the five components counts for 1/5 of a point.  The total score is then expressed as a percentage and a letter grade.  Here is the summary (full results available in this PDF):

Name                TOTAL   Grade        Version Tested
SmartOS             85%     B            3783
OpenSXCE            85%     B            2014.05
OmniOS              85%     B            151012
Oracle Solaris      65%     D            11.2
DilOS               60%     D            1.3.7
OpenIndiana         58%     F            151a
Dyson               58%     F            1327
illumian            58%     F            1
Belenix             0%      Incomplete   N/A

Non-Solaris Comparisons:
Fedora              100%    A            21
FreeBSD             88%     B            10.1

I've included the same criteria applied to Fedora 21 and FreeBSD 10.1 for reference only- their capabilities aren't intended to be a part of this analysis.

As we can see, the clear leaders under this evaluation are OpenSXCE, Joyent's SmartOS, and OmniTI's OmniOS, all with a "B" grade at 85%.  Next with a "D" grade are stock Oracle Solaris and DilOS.  OpenIndiana, OS Dyson, and Illumian are close behind those two, though still technically with failing grades.  Belenix could not be evaluated due to the main website being unavailable- it's apparently in the process of moving to a GitHub-hosted website, and the move is not yet complete.

While there are a few stars shining out in this bunch, to somebody who remembers the "golden days" of the OpenSolaris project, I can only reach one conclusion: the FOSS Solaris movement has been fractured.  The best contenders receive a "B" on the evaluation, and two of those three have significant backing from large IT companies.  It seems that no single OpenSolaris-derived project has the community backing required to bring it to premier status, and I'm concerned that without such support, the only FOSS System V derivative may be quickly headed towards fork-and-die oblivion...

03 February 2015

Write once, run anywhere: or, how to ruin a great idea

This is another one of those ideas in technology that sounds absolutely wonderful until you read the fine print.  The concept of "write once, run anywhere" was put forth by Sun (albeit with slightly different wording) in the late paleolithic, i.e. January of 1996, to coincide with the first public release of Java (JDK 1.0).

In theory, an application could be coded once, compiled into a platform-neutral executable form, then distributed as a bundle.  The "pre-compiling" of the application would also allegedly give significant performance improvements over an interpreted solution.  Here's where Java went horribly, miserably, and disastrously wrong, though- the language and compiled byte-code were platform-neutral, but the virtual machines in which they would eventually be run were decidedly not platform-neutral.

There are dozens of different Java virtual machines out there, from the stock Sun (now Oracle) "J2SE," enterprise "J2EE," and embedded "J2ME" to IBM's in-house Java VM used by applications like the Lotus suite, to OpenJDK, which is mostly open-sourced Sun / Oracle code plus clean-room replacements for the formerly-encumbered bits.  Then, of course, there's the non-Java VM for which one writes in Java: Dalvik (the main runtime environment on Android devices until 5.0 'Lollipop').  Predictably, each of these supports a slightly (or drastically) different feature set in byte-compiled programs, to say nothing of differences between versions of the same VM.

At this point, you're probably asking yourself, "hey, why does Guy hate Java?"  The answer is straightforward- I don't.  The language itself has quite a bit to recommend it.  The language solution, however, is practically a perfect parable of how not to deploy a suite of application development tools.

So what's the alternative?  For client / server or distributed applications, develop a good cross-platform standardized, interpreted API for the client-side part, and leave the compiled code server-side.  For purely client-side applications, write them in an interpreted language (try Python).

JavaScript and HTML 5 are far from perfect, but they're at least heading in the right direction.  Implementation incompatibilities between browsers are shrinking weekly, and the HTML 5 ecosystem is moving towards a reasonably harmonious and tolerable state.  Yeah, you still have to account for IE vs. Firefox vs. Safari vs. Chrome in your code, but the differences are becoming fewer in number.  That's progress!  Our favorite WORA suite, in contrast, sinks farther into oblivion with each new release.  It's in such bad shape, in fact, that recent versions include a mechanism to automagically launch obsolete VM versions for specific sites (think IE compatibility mode for Java).  It uses deployment rulesets to establish URL <-> VM pairs for the Java browser plugin to reference.

In summary: if you're primarily a Java developer, don't go anywhere.  We do need and will continue to need you for years to come as we sort out this mess.  If you're in that category, however, and you're less than 50 years old, make sure to learn another language as well.  I hear that Fortran has a good market...

27 January 2015

Geeks vs. Normal People

Geeks react very differently from normal, sane people in some very important ways.  Use this handy guide to help you distinguish the two categories...

It's your first day on a new job, and you have no idea what to expect.  You sit down at your desk, and the computer screen looks like this:


Geek:
"This is going to be awesome!  Let's get some real work done."

Normal Person:
*goes bug-eyed* ... "I think it's broken."

The latest fruity gadget was just announced at WWDC:



Normal Person:
"That looks nifty.  When can I start using it?"

Geek:
"Great- another way that tens of thousands of people can have personal info stolen."

A [senior] relative offers to give you this laptop from the basement:



Geek:

"Wow that's awesome!  You're serious- I can have it?  I wonder if I can put OS/2 on it..."

Normal Person:
"Yeah, um... no thanks, I'm good."

You suggest that you and your buddy watch a movie:



Normal Person:
"Sure!  I saw this great preview on Netflix for one..."

Geek:
"Okay, let me fire up XBMC and you can browse the collection.  If there's nothing in there you like we can always torrent something."

The summary: geeks are special people, with special needs.  If you decide to befriend or fall in love with one, you had better a) be one yourself, or b) mentally prepare for the incessant stream of odd behavior.  Don't ever assume that anything will be normal.

24 January 2015

The Five People You'll Meet in Planning Committees

It's a scenario that many of us have faced at some point in our professional careers: you're in a meeting with coworkers or clients, and some seemingly-reasonable proposal is casually made by a relatively new coworker.  The firestorm that erupts would be enough to stun into silence even witnesses of the 1980 Mount St. Helens eruption.

When confronted with the reality of maintaining day-to-day operations in the face of a constantly evolving field, people react in different ways.  In an IT organization, recognizing these attitudes in co-workers and clients can mean the difference between success and failure of a project (and sometimes between life and death of your professional career).  Here are some of the types that I've encountered so far:


"The Wave Rider"

This individual is in love with the bleeding edge.  Whatever the latest and greatest gadgets, tools, tricks, and fads are, these folks are convinced that your organization must upgrade- regardless of whether they make sense, or are feasible in any real-world scenario.  You can recognize these individuals by their constant references to sites like Engadget, Mac Rumors, and Tech Crunch.
  • Pro: these individuals will be the prime mover for most improvements that are made
  • Con: sometimes those aforementioned movements happen to be off cliff edges
  • Nemesis: The Unretired

"The Greener Grasser"
Nearly always unhappy with the way things are currently being done, this is the certified professional complainer.  It doesn't matter how easy, simple, effective, or resilient the current site is; it doesn't matter how many dollars the replacement will cost- the status quo must change.  These wonderful creatures can be spotted a mile away by their near-incessant complaining about every aspect of your organization.
  • Pro: when these individuals stop complaining, you know you've got a truly winning product
  • Con: your annual earplug budget to deal with these folks will typically exceed any raises you get
  • Nemesis: The Tape Wrapper

"The Toll Troll"
Extortion is the name of the game that this ruthless cut-throat plays.  Any time an institutional purchase or policy change needs to be made, they somehow always manage to come out on top.  Need to deploy a new customer portal?  Be prepared to fund five new positions in HR.  Need to replace some aging network infrastructure?  You're going to have to remodel an entire floor first.  These folks can be spotted by their presence at meetings in which you'd think they have no vested interest, during which they talk of how inconvenient all of the changes will be to their departments / divisions.
  • Pro: once given the proper incentives, these folks will become your strongest allies
  • Con: over budget, every time
  • Nemesis: everybody else in the meeting, if united

"The Unretired"
Just as the undead are stuck between two natural states, so too is this individual.  They have given up all interest in anything but working through the day-to-day grind, and resent any changes in routine, which they deem utterly superfluous and counterproductive.  They can be picked out of a crowd quite handily by the immensely annoyed face that they present whenever large-scale changes are discussed.
  • Pro: a veritable fountain of views both pragmatic and skeptical
  • Con: rarely, if ever, say anything but 'no' to new ideas
  • Nemesis: The Wave Rider

"The Tape Wrapper"
To say that this person has a singular obsession with rules would be like saying that Mozart was fond of music.  No project, undertaking, or visit to the restroom would be complete without a thorough recitation of institutional policy.  These odd characters can be spotted by sounding eerily cheerful when starting sentences with the words, "but our policy says that..."
  • Pro: can be relied on for a fair, unbiased, balanced, and considered opinion
  • Con: throws your project schedule out the window
  • Nemesis: The Greener Grasser

As always, I'd like to know what you think.  Do you know which type you are?  Can you think of any additional types that should be on the list?

20 January 2015

An Afternoon Stroll with DTrace

So, for those of you who haven't heard, DTrace is a tool that's been around since late 2003, but has recently started gaining popularity.  It's an end-to-end instrumentation framework for system debugging and performance analysis, which allows developers to gain a complete picture of the environment in which an application is being deployed, from the kernel all the way up to individual library function calls.

To utilize it, you start with an OS kernel that has DTrace support.  At the moment, that means Solaris, FreeBSD, NetBSD, and Mac OS X, with an experimental Linux port well underway.

Then, you get your compiled / interpreted languages and dependent libraries with DTrace support.  It's available for just about every language used in client apps and server-side scripting today, with the merciful exception of that awful "dot" stuff.  There are also quite a few full application suites with end-to-end support as well, notably Firefox.

Once you're working with a complete toolset, you use the dtrace command to load scripts (written in the 'D' programming language, hence the name) that describe the things you want to know and, optionally, what action(s) to take based on that information.

For example, you might ask DTrace, "which of my application's threads take the longest CPU time to execute, and within those threads, which syscall in particular consumes the most time, and what's blocking that syscall so that it takes so long?"  Such information is immensely useful- it can tell you about e.g. deadlocks between otherwise-independent threads in your application that want disk access at the same time, or resource conflicts between your application itself and other tools on which it depends.  Such problems are virtually impossible to pin down with conventional debugging and tracing tools.

So, just how usable is DTrace?  Since I'm an avid FreeBSD user myself, and Erlang is my poison of choice for application development, I decided to give that combo a try.

First off, you need to have a FreeBSD kernel with DTrace support.  This has been the default in GENERIC kernels on Tier 1 platforms since 7.1, iirc.  If you're on a custom kernel, instructions can be found here.  It's worth noting that on -CURRENT, as usual, your mileage will almost certainly vary (and in odd and confusing ways).  Once you've got an appropriately-configured kernel, you still need to actually load the DTrace modules (kldload dtraceall takes care of all of them at once).

Second, you need an application and/or language suite with DTrace support.  Erlang straight from ports has this capability, but it's not enabled in the default build (so ditto binary pkgs).  Grab an up-to-date ports tree, cd /usr/ports/lang/erlang; sudo make config and turn on DTrace.  Build and install the port with the usual process.  If you're going with a language suite here, then you also need an application to run with it.  I just used my own (very immature, pre-alpha) EMiLE project as a crash-test dummy.

Finally, you need some probes to actually run.  There are plenty of demo scripts in /usr/share/dtrace for 10.0 and later.  If you're running something earlier, you can get them with the sysutils/DTraceToolkit port.  I also found some demo goodies just for Erlang in this presentation.

Once you've got all of those things, you fire up your application, and run dtrace(1) with the appropriate arguments to run your command directly, or to point to your script file.  The results vary by your target, but are generally pretty amazing.  I discovered several resource concurrency issues and corrected them in about fifteen minutes thanks to DTrace, whereas I probably would have otherwise spent weeks hunting them down (if I found them at all).
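To give a flavor of what those arguments look like, here's a trivial D script of the sort I started with.  The predicate on beam.smp (the Erlang VM) is just my example; substitute whatever process you're curious about:

/* count system calls made by the Erlang VM, grouped by syscall name */
syscall:::entry
/execname == "beam.smp"/
{
        @counts[probefunc] = count();
}

Save it, run something like sudo dtrace -s beam_syscalls.d (the file name is whatever you choose), let it gather for a bit, and hit Ctrl-C to print the aggregation.  From there it's a short walk to timing individual calls and asking what's holding them up.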

The summary: apart from the headaches of using a new scripting language, if you're an application developer and you're not using DTrace, you're probably missing out on the 'deep view' of what your code is actually doing not just within its own confines, but to the hosting system at large.  It's definitely a tool that I'll be using going forward.

16 January 2015

Vexing the Snoops

There are a ton of interesting things in the recent congressional report on the NSA's rampant spying on the communications of the American public.  Now, like most normal folks, I don't have much of anything to hide, unless you count my obsession with cute cat photos or the time that I was really drunk and somehow thought that it would be a good idea to order workout-boosting supplements online from a sketchy "all-natural" vendor.  It took me two hours on the phone to cancel that junk...

But, I do believe that I have a basic right to privacy, and that exercising such a right at my discretion does not in any way form a legal justification that I have something to hide.  I have the right to have a private conversation with my friends without even having to wonder if some employee of the federal government will be reading that conversation at a later time.

To that end, I dug through the congressional report with the goal of finding out which privacy and anonymity technologies frustrate the NSA.  The short (and unpleasant answer): there is no single security technology available to typical consumers today that, alone, reliably protects against privacy intrusion.

There is hope, however.  Several technologies are mentioned in the report as being reasonably effective if deployed properly.  Furthermore, there are some basic strategies to ensure that your personal ramblings aren't the laughing-stock of the NSA water cooler.

Do Not:

  • Have any expectation of privacy when using major online services, like Google, Facebook, AIM, etc.  Just because you see that little padlock in your web browser doesn't mean your communications are safe.  In fact, the congressional report suggests that TLS alone, even with the latest protocols, is pretty much a joke, because the NSA has gained access to the data at the servers themselves.
  • Attempt to secure all communication.  If your communication at work or school, or between yourself and your doctor, is already covered by FOIA, FERPA, or HIPAA, then there's not much point in going security-crazy, as all of those already have legal access mechanisms built in.
  • Conduct any illegitimate commerce or business using ultra-secure communication mechanisms.  Uncle Sam gets really angry when he doesn't get his tax dollars, and that's likely to make you a priority target.  Oh, and don't be a terrorist- that'll do it too.  Once your sketchy commerce has been cracked open, which it almost inevitably will be, viz. Silk Road, your really embarrassing personal details are likely to follow suit.  In court.
  • Reveal extensive personal details about your life, identity, occupation, family, etc. to unknown persons, no matter how tempting it might be.
Do:
  • Take intelligent, informed steps to protect your personal, private, communication from prying eyes (more details on this to follow)
  • Feel free and unfettered to discuss whatever you want to discuss, even highly taboo or controversial subjects, provided that you take basic precautions.  The free and open exchange of ideas is the cornerstone of a free and open society, and fear of who might be listening should never hinder you in this regard.
  • Develop a close and trusted network of confidantes, with whom you trade secrets for secrets, dialog for dialog, idea for idea.

Now, the hot question of the day- how can one truly secure online communications in 2015?  Guess what- it is possible, but it's neither convenient nor fool-proof.  Convenience and security pull in opposite directions: each time you augment one, you diminish the other.  That's the way it works- get over it.  To secure your online communications, the trick is simple: use layers.  In fact, layer like you're stepping out of a tent into an Antarctic winter to go p*ss beside a frozen waterfall.

Examples of some good things to layer:
  • OTR (Off-the-Record) messaging: OTR can sit on top of virtually any other instant messaging protocol, including XMPP (Facebook, Google Talk, et al.), AIM, MSN, Yahoo!, and many more.  When used properly, it provides two-way trust relationships with your buddy, session encryption of messages, and perfect forward secrecy (which is really important in the age of supercomputers).
  • Tor (The Onion Router network): Tor is an anonymizing "hidden network" within the public Internet that, again, when used properly, allows two parties to communicate with reasonable security.  It's often abused to access public websites which store all sorts of unique data on you in other ways, but when your communication stays within the Tor network it's a reasonably effective tool, when layered with other mechanisms.
  • PGP (Pretty Good Privacy): it's a decades-old encryption technique utilizing public-key cryptography that the recent report surprisingly suggests can still stymie snoops in some scenarios.  Like your house keys, however, you have to keep your PGP private keys secure, or all bets are off!
  • FDE (full-disk encryption) and/or FBE (file-based encryption): once an attacker has physical access to your computing devices, like your laptop, desktop, or mobile phone, all bets are off unless you're using comprehensive encryption.  Think that Windows or OS X login password alone protects you?  Think again- I can get past those in less than three minutes in most cases, and that's just as a casual effort.  If it's that easy for me, imagine how quickly an NSA snoop or a malicious hacker could do it, unless you pair a strong password with strong encryption... and even that's not a complete guarantee (hence the layering).
Using any two of those together is probably sufficient for most private communication- preferably combining a forward-secrecy technology like OTR or Tor with a physical-access protection like FileVault 2, BitLocker, or eCryptfs.  If you want an extra layer of security, as a buffer, pick three.  Here are some examples, ranging from easiest to most difficult (note that all of these require your friends to have similar capability):

Easy: Turn on BitLocker or FileVault 2, with strong passwords, and use OTR for your IM conversations with friends.  Make sure they have disk encryption enabled as well!

Moderate: Using Tor and Privoxy, create a free online e-mail account (make sure to set it up through Tor and only access it that way as well), and encrypt all the messages you send and receive with trustworthy PGP software, encrypting your private keys with strong passwords.

Difficult: Turn on FDE or FBE, set up your own XMPP micro-server with federation over Tor, and communicate with your friends' micro-servers via Onion hidden services, using OTR on your IM client(s).
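
For the onion-service half of that last recipe, the Tor side is genuinely just a couple of torrc lines (a sketch only- the directory path is arbitrary, and 5269 is the standard XMPP server-to-server port):
# Sketch: expose a local XMPP server's s2s port as a Tor onion service.
HiddenServiceDir /var/lib/tor/xmpp_onion/
HiddenServicePort 5269 127.0.0.1:5269
Tor then drops a 'hostname' file in that directory containing the .onion address your friends' servers federate with; the genuinely fiddly part is convincing your XMPP server to make its outbound s2s connections through Tor's SOCKS proxy, which varies from server to server.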

So, to summarize, there's no such thing as guaranteed privacy.  Having said that, avoiding "high-risk activities," and using some of the privacy tricks mentioned above, can help ensure that your private communications stay... well... private!

15 January 2015

Protecting Sensitive Data in Erlang with Function Callbacks

Function callbacks are one of those programming tricks that, once discovered, seem difficult for one to live without.

The concept of a callback is simple: you create a function, "package" it in some way, and pass it as an argument to another function, to be executed wholesale at a later time.  In Erlang, the process looks roughly like this (a minimal sketch- Greet and RunLater are just placeholder names):
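%% Minimal sketch: bind an anonymous fun, hand it to another function, run it later.
Greet    = fun(Name) -> io:format("Hello, ~s!~n", [Name]) end,  %% create the function and "package" it in a variable
RunLater = fun(Callback, Arg) -> Callback(Arg) end,             %% any function can accept the packaged callback as an argument...
RunLater(Greet, "world").                                       %% ...and execute it wholesale later; prints "Hello, world!" and returns ok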

Why would one do this?  The main reason in most languages is to be able to build a function's logic in stages, particularly when such stages cross different compilation units, with the final stage being the callback.  There are other reasons, though.  Erlang, in particular, has one added benefit: when callback functions are implemented as anonymous functions and assigned to a variable, it becomes much more difficult for the callback function to be executed unintentionally.  An Erlang module can't export variables, only regular functions, so putting an anonymous function in a variable means that only the functions that receive the callback as an argument will ever know it exists.

My personal favorite usage of callbacks in Erlang is to protect functions which should only be available to authenticated users.  It's also handy for building GUIs with e.g. wxWidgets (and Erlang's wx module).  I'd be interested to hear what other uses people have found for function callbacks; suffice it to say, though, that I'm hooked on them.
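
For flavor, here's a bare-bones sketch of that authentication pattern (the module name, the hard-coded credentials, and run_if_authenticated/3 are all invented for illustration):
-module(secure_demo).
-export([run_if_authenticated/3]).

%% Illustrative only: run a packaged-up sensitive operation, but only if the
%% caller can authenticate.  A real system would check hashed credentials,
%% not hard-coded strings.
run_if_authenticated(User, Password, Callback) when is_function(Callback, 0) ->
    case authenticate(User, Password) of
        true  -> Callback();                  %% the protected work only ever runs here
        false -> {error, not_authorized}
    end.

authenticate("alice", "s3cret") -> true;
authenticate(_User, _Password)  -> false.

The caller wraps the sensitive work in an anonymous fun, e.g. secure_demo:run_if_authenticated("alice", "s3cret", fun() -> fetch_payroll_report() end), where fetch_payroll_report/0 stands in for whatever function you'd rather not export to the world.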

13 January 2015

Mind the Gap: why end users are constantly frustrated with tech support

It's a simple assertion- IT is not what it was a decade ago.  Like any profession or field of expertise, it's constantly evolving and changing.  In this particular case, it's growing, and rapidly.  According to the US Census Bureau's report on computer and Internet use, the number of US households with a computer has jumped by more than ten percent in the last decade, even as the population grew by twenty million people.  Of course, those census numbers don't include today's smartphones, many of which are more powerful than the typical desktop or laptop computers of 2003 anyway.  As if that's not enough, six of the thirty job fields projected to grow the most in 2016 are explicitly IT-related.

But we have a big problem.  When IT was a reasonably compact field, it was straightforward for an individual to specialize in one particular aspect while still staying reasonably well-versed in the others.  One could be a network engineer in day-to-day duties, but still be a competent server or desktop hardware repair technician in a pinch.  As the market diversifies, that kind of cross-disciplinary knowledge is becoming less and less practical.  It's not that those in tech fields are getting lazy or complacent (though that does happen), but rather that there's just so much that it's impossible to keep up with it all.

Why is this a problem?  Consider another one of the premier and expansive professional fields with many parallels to IT: medicine.  In medical practice, you have individuals with basic, across-the-board knowledge (general practitioners and registered nurses).  If you have a problem that a "generalist" can't resolve, you go to a "specialist," e.g. an orthodontist, a podiatrist, etc.  That approach entails horrors in its own right, but let's assume for a moment that the model is sane.  What about IT generalists- who are they, and what can they do?  Think about it for a moment.  I'll wait...

Okay, so there's an answer, but it's not a pretty one.  Our IT generalists these days are the folks who staff help desks and call centers.  These individuals are required to have wide-reaching, general diagnostic knowledge across the whole IT spectrum.  They're expected to be able to diagnose and correct some of the most common issues, and are also expected to know when to pass the buck on to specialists.  Without these generalists, the specialists would be in an awful mess.  But the generalists often hold some of the lowest-paid positions in IT, have some of the highest turnover rates and, to make matters worse, due to those factors have very tightly controlled and scripted interactions with those they're assisting.  That's particularly troubling when one considers that, like those in the medical field, those in IT are often treated with disdain, rudeness, and downright disrespect by those whom they serve.

Imagine for a moment that your family doctor(s) typically had less than five years of experience and rotated out every six months due to poor pay, bad hours, and/or moving on to more lucrative 'specialist' careers.  Now imagine that those temporary doctors only consulted with you by telephone, or in in-person appointments of less than fifteen minutes.  If you're alarmed, that's good- it means we're on the right track here.  That's where your tech support is right now, and if you've ever been ready to lose your cool while trying to get tech help, that's the reason why.

The only solution immediately apparent to me, though there are almost certainly others, is to make the "IT generalist" a real, bona fide, reasonably-compensated profession with degrees, training, internships, and the rest to go with it.  It seems to me that until a few institutions grasp this basic concept, "big IT" is heading towards a cliff edge in a car with no brakes.

12 January 2015

fun() -> functional end.

There's something special about having obscure, dust-covered tidbits of knowledge.  Like a collection of rare books (though much easier to move) or an exotic flower garden (though requiring less water), collections of the mind tend to bring much joy and diversion.  The same principle, it seems, must apply to programming languages.  After years of working with more traditional languages like C / C++ and Fortran, and then delving into the exotic but almost useless ones like Forth and m68k assembly, practical circumstances finally forced me to learn a programming language that is slightly less confined to particular niches (when did C get that way?)... perhaps even a bit, dare I say, bourgeois.  Going too mainstream, of course, would be unthinkable, so I settled on Erlang.  That was more than six months ago now, and I haven't looked back.  The project, by the way, was a laptop encryption status reporting tool for Linux, and a corresponding web application for viewing the information.  There are plenty of them out there for Windows and Mac OS X, but none for our favorite FOSS platform.

What's so special about Erlang, you might ask?  Well, here are the highlights...

  • It relies heavily on the functional paradigm (more on that later)
  • There's built-in concurrency and message passing - no extra libraries needed (see the short sketch just after this list)
  • Typically, Erlang is compiled to byte-code, but can also be interpreted
  • It runs in a VM (comparing this to Java would be a disservice, let's avoid that)
  • The code of a module can be changed while the module is running, provided that a few basic rules are observed.
  • It's notoriously difficult to break and/or compromise, again provided that a few basic rules are observed
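
To give a quick taste of that concurrency bullet (a minimal sketch- ping_demo and its function names are made up for illustration), here's one process pinging another and waiting for the reply, with nothing but the language itself:
-module(ping_demo).
-export([start/0, loop/0]).

%% Illustrative sketch only: spawn a process, send it a message, await the reply.
start() ->
    Pid = spawn(?MODULE, loop, []),   %% a new lightweight process
    Pid ! {self(), ping},             %% asynchronous message send
    receive                           %% block until the reply arrives (or give up after 1s)
        pong -> pong_received
    after 1000 ->
        timed_out
    end.

loop() ->
    receive
        {From, ping} -> From ! pong, loop();
        stop         -> ok
    end.
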
What is it particularly good at?  It's great for moving data from point 'A' to point 'B' based on intelligent, reliable, and rich decision-making.  It was originally created in the mid-80s to power telephone network switches.  Today, it shines as an integral component in products like ejabberd, WhatsApp, Facebook Messenger, Call of Duty, and Chef, to name just a few.

What is it particularly bad at?  Erlang is flat-out terrible at advanced manipulation of binary data.  Things like image and video processing, emulators, and heuristic file scanning are out of its purview.

So, it's useful and has some big names associated with it, but is unknown to the general populace.  Perfect!  But where should one start?  The Erlang web site is fairly helpful, and will get you going in terms of getting the programming environment up and working (if you're on just about any GNU/Linux distro, it's probably already available for download from the normal software repo).  But that doesn't address what to do with the language.  If you like the modular, discrete, "learn it one small step at a time" approach, then exercism is for you.  In my case, I prefer the "sink or swim" approach, so after using the online version of Learn You Some Erlang for Great Good! as a primer, I started writing a simple demo point-of-sale system.  I wanted menus, tickets, and receipts.  And that's when the proverbial sh*t hit the fan...

Part of Erlang's magic is called referential transparency.  This means that any given expression (an action taken) could be replaced with its value (what it 'returns') without changing the way that the program works.  When referential transparency is maintained, Erlang is deterministic.  This is what allows hot code swapping to work, and also contributes to the system's message passing power.

...but if you're coming from imperative programming, it's your worst nightmare!

Take a simple 'for loop' in C:
for ( x = 0; x < a; x++ ) { printf( "Doing something...\n" ); }
The concept is simple: you want to do something a variable number of times.  But because the value of x changes as the loop runs, and x could be used before or after the loop (try declaring x as, e.g., a void and watch the hilarity!), this breaks referential transparency.  For this reason, in Erlang, 'variables' (which aren't actually variable) cannot have their value changed once bound, in any particular context.

Instead, to accomplish this same goal, you have to back up on your chain of reasoning, and ask what it is that you're trying to accomplish.  To just do the exact same thing as above, preserving referential transparency, in Erlang you might instead say:
myMessage( A ) -> io:put_chars( string:copies( "Doing something...\n", A ) ).
In this example, one would simply call myMessage (a function) with the number of repetitions, A, as its single argument.  This preserves referential transparency, i.e. any instance of myMessage( N ) could be replaced with its return value (which in this case would be the return value of io:put_chars, the atom ok).

Most probably, however, if you're using a 'for loop' in C, you're trying to do something with an array.  If you back up far enough on your chain of reasoning in Erlang, most such actions can be very quickly represented as list comprehensions, which (as the name suggests) build one list by processing one or more others in some way.  Say you want to convert a string, which in Erlang is just syntactic sugar for a list of character codes and works the same way, to uppercase... but remove all spaces.
[ string:to_upper( Char ) || Char <- "just a simple test", Char /= $\s ]
In this case, you (the programmer) don't need to know how long the list is, or walk through it one item at a time- just tell Erlang how to process it, and let it go!

Now we've covered referential transparency, the pros and cons of Erlang, and how to go about getting started in it.  Soon I'll talk about the message passing and concurrency parts, and eventually get into anonymous function callbacks, one of my favorite Erlang tricks.

The Obligatory "Inaugural Post"

What brought this calamity on, you ask?  It's simple: learning.  As a seasoned IT professional with more than a decade of experience, I frequently encounter situations which require unorthodox solutions, highlight social issues, or are just downright comical.  Most of the time, all three categories apply.  Since it seems a shame to let all of these things go to waste, and I'm not sadistic enough to actually make people listen to these stories in person, I offer this collection for you to peruse at your own leisure.  Enjoy!