Thursday, February 2, 2012

Creating a Virtual Appliance with Karaf - Part 2

The last few months have been full of decisions and hard work, and they ultimately led to a number of good things, including a new Virtual Appliance containing a fully pre-configured software development environment consisting of applications that are fully consistent with the Apache Software License 2.0! Completely open-source, not stripped down, fully functional and, best yet, fully configured.
First, a couple of decisions had to be made.  Instead of working inside of a full-blown cloud as I originally proposed, I decided that it would save time to target a specific virtualization technology: VMware Player. This technology was chosen because it is free to use, which lowers the cost barrier for new developers.  Second, for an IDE, I spoke with a number of my open-source colleagues and chose IntelliJ IDEA Community Edition.  Next, I had to decide what operating system to use, and I chose CentOS. How should I distribute this new VM?  That's the sticky part. To help manage the creation and distribution of this VM, I created a small open-source company called Atraxia Technologies. Unfortunately, this doesn't really solve the problem of how to distribute it. For reasons that I'll explain later, I still don't have an answer for that.

Lastly, I needed to decide what the purpose of the VM should be. Sure, creating a virtual appliance is a fun thing to do. But if it doesn't have a clear purpose, nobody is going to use it.  So, after talking to my fellow open-source developers, I decided that my first Virtual Appliance would be a fully configured software development environment.  Many of my friends and I mentor a number of new software developers. Unfortunately, a large amount of time is needed to get these developers' environments configured, reducing the amount of time we can spend helping them improve their software development skills. Having a VM they can install by themselves will greatly reduce configuration time.

The virtual appliance was pretty easy to set up.

First, I downloaded and installed VMware Workstation. This is a $200.00 product, but it made creating the VM pretty easy, so it was worth the cost.  Once VMware Workstation was installed, I downloaded CentOS 6 and installed it as a guest.  Again, this was pretty easy. The login and password are both "blue". Next I installed the IDE, the 1.6.0_22-compatible version of OpenJDK, Git, Subversion, Maven 2.2.1 and 3.0, and Nexus.

Why install Nexus? Well, in my experience there are cases where a build is halted prematurely, which results in certain Maven metadata files becoming corrupted. In those cases most developers will simply delete their /home/blue/.m2/repository directory instead of hunting for the corrupted file. However, on a wireless connection, blowing away your repository can result in a very long build, because Maven has to re-download all of the libraries over the wireless link and then rebuild the repository directory.  To fix this, each VM comes with its own preconfigured Nexus repository, and the .m2/settings.xml file is written to pull files only from the local Nexus.  The local Nexus, in turn, points to the central Maven repository, Codehaus, and a couple of other public repos.
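
For the curious, here is a rough sketch of what "only pull files from the local Nexus" can look like. It is a small Python script that generates a minimal Maven settings file with a single catch-all mirror. The URL, port, and the "local-nexus" id are my assumptions for illustration and are not copied from the actual VM; a real settings file would normally carry the Maven XML namespace and possibly more sections.

```python
import xml.etree.ElementTree as ET

# Assumed URL of a Nexus instance running on the VM itself. The port and
# path are illustrative only, not taken from the actual appliance.
LOCAL_NEXUS_URL = "http://localhost:8081/nexus/content/groups/public"


def build_settings(nexus_url=LOCAL_NEXUS_URL):
    """Return settings.xml text that routes every repository through one mirror."""
    settings = ET.Element("settings")
    mirrors = ET.SubElement(settings, "mirrors")
    mirror = ET.SubElement(mirrors, "mirror")
    ET.SubElement(mirror, "id").text = "local-nexus"  # hypothetical id
    ET.SubElement(mirror, "mirrorOf").text = "*"      # "*" mirrors all repositories
    ET.SubElement(mirror, "url").text = nexus_url
    return ET.tostring(settings, encoding="unicode")


if __name__ == "__main__":
    # Print the generated file; redirect it to ~/.m2/settings.xml to try it out.
    print(build_settings())
```

The important part is the mirrorOf value of "*": with it, Maven sends every repository request to the one mirror, so the local Nexus decides what gets fetched from the outside world.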

The Nexus repository will cache all of the files that Maven needs to build. The first time you build an application, it may take some time to populate the Nexus repository. After that, however, even if you have to blow away your local Maven repo, rebuilding your application and your Maven repo takes very little time, because everything comes from the local cache.

Now on to the JDK.  This is where things got sticky.  Despite all of the hard work of the OpenJDK team, there are still some applications that won't build with it. And, due to licensing restrictions from Oracle, I can't bundle the Oracle JDK with each VM. So, to test the VM, I had to install the Oracle JDK, compile my test application (Apache Karaf), and then uninstall the JDK and point the PATH and JAVA_HOME environment variables back to OpenJDK.  OpenJDK is a perfectly fine JVM for most developers. But for folks developing applications like Hadoop, Accumulo, etc., the Oracle JDK is really necessary.  If you need it, it is free to download and use, but not free to distribute.
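
If you end up flipping between JDKs the way I did, the switch boils down to two environment variables. Below is a minimal Python sketch of that idea; it is not what the VM ships with. The helper name env_for_jdk and the install path in the example are hypothetical.

```python
import os


def env_for_jdk(jdk_home, base_env=None):
    """Return a copy of the environment with JAVA_HOME and PATH pointing at jdk_home."""
    env = dict(os.environ if base_env is None else base_env)
    env["JAVA_HOME"] = jdk_home
    jdk_bin = os.path.join(jdk_home, "bin")
    old_path = env.get("PATH", "")
    # Put the chosen JDK's bin directory first so its java/javac win the lookup.
    env["PATH"] = jdk_bin + os.pathsep + old_path if old_path else jdk_bin
    return env


if __name__ == "__main__":
    # Hypothetical install location; adjust to wherever your JDK actually lives.
    env = env_for_jdk("/usr/lib/jvm/java-1.6.0-openjdk")
    print(env["JAVA_HOME"])
    print(env["PATH"].split(os.pathsep)[0])
```

The returned dictionary can be handed to subprocess.run(..., env=env) to run a single build against a particular JDK without touching the login shell's settings.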

So, back to distributing the VM.  The VM itself is complete. However, I'd have to pay $$ in order to get the 3 gigabyte file hosted.  As such, it is sitting on my hard drive awaiting some generous donor of bandwidth to host it. The VM is called "Atraxia Blue" and it is the first of three planned virtual machine offerings. This one is intended to be a desktop development environment. The next one will take the place of the software hub used by most development teams to host their ticketing system, Sonar, and central Nexus repository.  I'm still researching whether I should also include a central Git repository.  This offering will be called "Atraxia Sienna".  The last VM I will produce will include open-source office automation software and some open-source back-end business tools that are being developed by the Apache Software Foundation. This one will be called "Atraxia Pointy-Hair". I'm open to a different name though, if someone wants to suggest one.

The final goal in all of this is to have a suite of completely open-source virtual appliances a small team or company can use to stand up a complete business, fully configured out-of-the-box. 

Oh, and thanks to my daughter for coming up with a new motto for Atraxia Technologies: "The Cloud, only Fluffier".

Until next time!