Haizea 1.0 Beta 1 now available

A new version of the Haizea Lease Manager was released a few days ago and is available for download at http://haizea.cs.uchicago.edu/

Haizea can be used as a drop-in replacement for OpenNebula’s scheduling daemon, providing OpenNebula with more advanced scheduling capabilities such as advance reservations and queueing of requests when there are no resources available. The latest version of Haizea is the first of two betas on the road to Haizea 1.0. This beta includes several major new features:

  • Support for pluggable scheduling policies: Whereas previous versions of Haizea used hardcoded scheduling policies (e.g., to determine whether a lease could be accepted, whether it should be preempted, etc.), this version allows users to choose between different policies, or to write their own custom policies that they can “plug into” Haizea. Writing a custom policy only requires writing a Python module implementing a set of methods described in the Haizea documentation.
  • Support for heterogeneous resources and arbitrary resource types: Previous versions of Haizea required that each lease be composed of homogeneous nodes (i.e., all the VMs in a lease had to request the same amount of resources), and only supported five resource types (CPU, Memory, Disk, Inbound network and Outbound network). This version supports leases with heterogeneous resource requirements and allows users to define arbitrary resource types that can then be requested by leases.
  • New LWF (Lease Workload Format): Leases are now described using a new XML format that, unlike the previous LWF, supports specification of heterogeneous nodes and arbitrary resource types.
  • OpenNebula 1.4 support: The OpenNebula enactment module has been updated to support OpenNebula 1.4. The new enactment module uses OpenNebula’s XML-RPC API, instead of accessing the OpenNebula database directly.
  • More unit tests: Besides the existing trace-based unit tests (which run simple tracefiles with Haizea), new unit tests have been written to make sure that Haizea’s core data structures, like the slot table and the resource mapper, are working correctly.
  • Pydoc documentation: Most of Haizea’s modules are now fully documented using Pydoc. A browsable HTML version is provided on the Haizea website for developers who want to write their own custom policies.
  • New project management site: The PhoenixForge site where Haizea is hosted has recently migrated to a Redmine site (from the previous Trac-based Dr.Project site). This new site provides a much better web interface and several other project management features.
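
To make the first two features more concrete, here is a minimal sketch of what a pluggable admission policy over heterogeneous leases could look like. Every name in it (LeaseRequest, AdmissionPolicy, accept_lease, CapPerNode) is hypothetical and chosen only for illustration; the actual methods a policy module must implement are the ones described in the Haizea documentation.

```python
from dataclasses import dataclass
from typing import Dict, List


# Hypothetical types for illustration only; these are NOT Haizea's classes.
@dataclass
class LeaseRequest:
    # One dict per requested VM, mapping a resource type name to an amount.
    # Nodes may differ from each other, and resource names are arbitrary.
    nodes: List[Dict[str, float]]


class AdmissionPolicy:
    """Interface a custom policy would implement (assumed, not Haizea's API)."""

    def accept_lease(self, lease: LeaseRequest) -> bool:
        raise NotImplementedError


class AcceptAll(AdmissionPolicy):
    """Trivial policy: every lease is admitted."""

    def accept_lease(self, lease: LeaseRequest) -> bool:
        return True


class CapPerNode(AdmissionPolicy):
    """Reject leases in which any single node exceeds a per-resource limit."""

    def __init__(self, limits: Dict[str, float]):
        self.limits = limits

    def accept_lease(self, lease: LeaseRequest) -> bool:
        # Resource types without a configured limit are treated as unlimited.
        return all(
            amount <= self.limits.get(rtype, float("inf"))
            for node in lease.nodes
            for rtype, amount in node.items()
        )


# A lease with two different nodes, one of which asks for a custom "GPU" type.
lease = LeaseRequest(nodes=[
    {"CPU": 1, "Memory": 1024},
    {"CPU": 4, "Memory": 8192, "GPU": 1},
])
```

With this shape, swapping policies only means handing the scheduler a different object: CapPerNode({"CPU": 2}) would turn the lease above away because its second node asks for 4 CPUs, while AcceptAll() would admit it.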

Please note that the current version is still a beta, and not suitable for use in production environments. Its main purpose is to get feedback from the community so, if you encounter any bugs or there’s any particular feature you’re interested in, please don’t hesitate to let us know on the Haizea mailing list: https://mailman.cs.uchicago.edu/mailman/listinfo/haizea

Haizea and Private Clouds

The latest version of the Haizea Lease Manager (Technology Preview 1.3) was released a few days ago, so this seems like a good opportunity to talk about why Haizea exists and what it means to OpenNebula users.

As most readers of this blog know, OpenNebula allows you to manage the dynamic deployment of virtual machines (VMs) on a pool of physical resources. There are many reasons why you would want to virtualize your infrastructure (see the OpenNebula use cases at the bottom of this page), and the one I will focus on here is creating a “private cloud” (a subject that has been discussed previously on this blog).

Since we’re entering the perilous terrain of buzzwordiness, let me stop for a second to clarify that whenever I use the term “cloud” in this post, I specifically mean an “Infrastructure-as-a-Service (IaaS) cloud”, such as Amazon’s EC2, where computational infrastructure is provisioned on-demand as virtual machines on a large data center. Yes, I realize “cloud” can and does mean many other things (although we’re still far from agreeing on what it means) but, for now, let’s stick to the IaaS aspect of clouds.

One of the characteristics frequently attributed to clouds is that of “infinite capacity”. Thus, large cloud providers like Amazon EC2, Flexiscale, and ElasticHosts have evolved towards an immediate provisioning model: when users ask for additional capacity, they get it, subject to some reasonable limitations (there may be a delay in setting up the extra VMs, providers may have limits on how much capacity one single user can request, etc.). If you assume infinite capacity, this provisioning model is pretty reasonable. There is no need to, for example, allow users to make reservations in advance: if you need resources from 2pm to 4pm, why would you need to reserve them? Just show up at 2pm and they will be there for you, because capacity is “infinite”.

Of course, there is no such thing as “infinite capacity”, but large cloud providers can at least provide the illusion of infinite capacity. However, if you have a relatively small number of resources (compared to Amazon or Google) and want to build a “private cloud” with OpenNebula on top of them, you’re probably in no position to assume you’ve got infinite capacity.

But why would you want to create a “private cloud”? Isn’t outsourcing your infrastructure to large external providers (instead of keeping it in-house), thus reducing your IT expenses, one of the biggest selling points of cloud computing? Sure, but some of us will still have our own IT infrastructure to manage and, although datacenter virtualization has been around since before “clouds” became “the next big thing”, there are certain benefits to managing your infrastructure like a “private cloud”:

  1. You can provide your in-house users with all the benefits of deploying their machines on EC2, without actually paying EC2 to do it. My intuition is that, if you already have an IT infrastructure that is mostly amortized, this will make sense financially (unlike a business with no IT infrastructure, where relying on a large cloud provider makes more sense than making a huge initial investment in new infrastructure). That said, I will be more than happy to be corrected on this point, as it is simply an intuition.
  2. You can become a cloud provider. Having a “private cloud” doesn’t preclude the possibility of adding a public interface, using tools like Nimbus or Eucalyptus, and turning all or part of your private cloud into a public cloud that can be accessed by external users via the Internet.
  3. All of the above. If you’re servicing in-house users, it’s almost certain that your infrastructure will be underutilized some of the time. This unused capacity could be sold to external users.

This is all nice and dandy but, as I said earlier, a private cloud can’t assume it has “infinite capacity”. Thus, relying on an immediate provisioning model just doesn’t hold water. Requests for resources are going to have to be prioritized, queued, pre-reserved, and even rejected. Tools for building private clouds will need to support more sophisticated resource scheduling than just immediate provisioning, and this is where Haizea comes in.

Haizea is a lease manager that can be used as a drop-in replacement for OpenNebula’s scheduler, providing scheduling features not found in other cloud and virtualization solutions, such as efficient support for advance reservations, queuing of best-effort requests and, coming soon, pluggable scheduling policies. While still supporting an immediate provisioning model, Haizea also allows OpenNebula users to pre-reserve resources (in anticipation of capacity peaks) or queue requests that can afford to wait a while (another feature that will be added to Haizea in the future is best-effort scheduling with deadlines, so there will be a finite bound on the waiting time). Again, if you have a datacenter of Amazonic proportions, Haizea probably makes no sense. But if you have a more modest datacenter, and want to build a private cloud on it, you will need to be more judicious about how you slice up your resources amongst users (and I suspect that most of us fall into the non-Amazonic category).
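
The difference between these provisioning models can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Haizea's scheduler: the class TinyScheduler, its methods, and the use of integer hours and whole nodes as units are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TinyScheduler:
    """Toy model of a finite pool of identical nodes (hypothetical)."""

    capacity: int
    # Each reservation is (start, end, nodes) and covers the interval [start, end).
    reservations: List[Tuple[int, int, int]] = field(default_factory=list)
    # Best-effort requests that could not start immediately: (duration, nodes).
    queue: List[Tuple[int, int]] = field(default_factory=list)

    def used_at(self, t: int) -> int:
        return sum(n for s, e, n in self.reservations if s <= t < e)

    def fits(self, start: int, end: int, nodes: int) -> bool:
        # Usage only rises at reservation start times, so it is enough to
        # check the requested start and every existing start inside the window.
        points = [start] + [s for s, _, _ in self.reservations if start < s < end]
        return all(self.used_at(t) + nodes <= self.capacity for t in points)

    def reserve(self, start: int, end: int, nodes: int) -> bool:
        """Advance reservation: accepted only if capacity is free for the whole window."""
        if self.fits(start, end, nodes):
            self.reservations.append((start, end, nodes))
            return True
        return False

    def best_effort(self, now: int, duration: int, nodes: int) -> str:
        """Best-effort request: start now if possible, otherwise wait in the queue."""
        if self.reserve(now, now + duration, nodes):
            return "running"
        self.queue.append((duration, nodes))
        return "queued"


sched = TinyScheduler(capacity=4)
first = sched.reserve(14, 16, 3)        # 2pm to 4pm, 3 of the 4 nodes
second = sched.reserve(15, 17, 2)       # overlaps the first; would need 5 nodes
third = sched.best_effort(14, 1, 2)     # cannot start at 2pm, so it waits
```

Under an immediate provisioning model, the second and third requests would simply fail. Here the overlapping reservation is rejected up front, before anyone shows up expecting resources, and the best-effort request is queued instead of being dropped.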

To wrap this up, I’d like to refer to a technical report that has been getting quite a bit of press lately, “Above the Clouds: A Berkeley View of Cloud Computing”. This report has been getting mixed reviews and, personally, I can’t say I agree with many of the things they say, particularly the way they dismiss private clouds right from the outset. However, I think they raised a good point in “Number 5 Obstacle [for Cloud Computing]: Performance Unpredictability”, where they stated:

The obstacle to attracting HPC is not the use of clusters; most parallel computing today is done in large clusters using the message-passing interface MPI. The problem is that many HPC applications need to ensure that all the threads of a program are running simultaneously, and today’s virtual machines and operating systems do not provide a programmer-visible way to ensure this. Thus, the opportunity to overcome this obstacle is to offer something like “gang scheduling” for Cloud Computing.

Haizea, in fact, has supported VM gang scheduling from day one. The lease abstraction used in Haizea allows users to request not just individual VMs, but groups of VMs that must be treated atomically: in other words, VMs that must either all be running simultaneously or not at all (which involves gang-scheduling those VMs).
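
The all-or-nothing property is the important part, and it can be illustrated with a small sketch. The function map_gang below is hypothetical and is not Haizea's resource mapper; it assumes each VM is described by a single CPU demand and uses a simple first-fit heuristic, which may miss packings that a smarter mapper would find.

```python
from typing import Dict, List, Optional


def map_gang(vms: List[int], free: Dict[str, int]) -> Optional[Dict[int, str]]:
    """Place every VM in the group or none of them (hypothetical sketch).

    `vms` holds one CPU demand per VM; `free` maps host name to free CPUs.
    On success, `free` is updated and a VM-index -> host mapping is returned.
    On failure, `free` is left untouched and None is returned.
    """
    remaining = dict(free)  # work on a copy so a failed attempt has no effect
    mapping: Dict[int, str] = {}
    # Place the most demanding VMs first; ties keep their original order.
    for idx in sorted(range(len(vms)), key=lambda i: -vms[i]):
        host = next((h for h, cpu in remaining.items() if cpu >= vms[idx]), None)
        if host is None:
            return None  # one VM does not fit, so the whole group is refused
        remaining[host] -= vms[idx]
        mapping[idx] = host
    free.update(remaining)  # commit only once every VM has a host
    return mapping


hosts = {"h1": 4, "h2": 2}
placed = map_gang([2, 2, 2], hosts)      # three VMs, all of which fit

small = {"h1": 2, "h2": 2}
refused = map_gang([2, 2, 2], small)     # only two would fit, so none start
```

Working on a copy and committing at the end is what makes the group atomic: a partially placed lease never holds resources that another lease could have used.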

So, if you’re interested in virtual machine scheduling that goes beyond immediate provisioning, I invite you to check out Haizea. It’s still a technology preview, but it’s being actively developed and a 1.0 release shouldn’t be too far off.

Borja Sotomayor

Haizea Technology Preview 1.3 Released

Borja Sotomayor has just announced the release of a new version of the Haizea Lease Manager. Technology Preview 1.3 now includes support for OpenNebula 1.2 (released one week ago), and enhanced stability and robustness. This is a new step towards TP2.0, which will include a policy engine and several novel scheduling features. The detailed list of changes is available in the project changelog.

Ignacio Martín Llorente

OpenNebula on the List of Top 100 Players in the Cloud Computing Ecosystem

SYS-CON’s Cloud Computing Journal has just expanded its list of the most active players in the fast-emerging Cloud Ecosystem. The list includes the most active cloud players, which are driving the most Enterprise-relevant innovation. Cloud computing is an opportunity for organizations to implement low-cost, low-power, high-efficiency systems to deliver scalable infrastructure.

Ignacio Martín Llorente

The Emerging Ecosystem of Cloud Components

MIT Technology Review has just published an interesting article entitled “Opening the Cloud” about open-source technological components to build a cloud-like infrastructure. The article focuses on the IaaS (Infrastructure as a Service) paradigm, describing the components required to develop a solution to provide virtualized resources as a service. The article briefly describes the following technologies: OpenNebula, Globus Nimbus, and Eucalyptus.

In the OpenNebula project, we strongly believe that a complete Cloud solution requires the integration of several of the available components, with each component focused on a niche. The open architecture and interfaces of the OpenNebula VM Manager allow its integration with third-party tools, such as capacity managers, cloud interfaces, service adapters, and VM image managers, thus supporting a complete solution for the deployment of flexible and efficient virtual infrastructures. We maintain an Ecosystem web page with information about third-party tools to extend the functionality provided by OpenNebula.

Ignacio Martín Llorente

First Technology Preview of the Haizea Lease Manager

I would like to warmly welcome Haizea to the virtualization ecosystem. This new technological component is an open-source VM-based lease management architecture, which can be used:

  • As a platform for experimenting with scheduling algorithms that depend on VM deployment or on the leasing abstraction.
  • In combination with the OpenNebula virtual infrastructure manager, to manage a Xen or KVM cluster, allowing you to deploy different types of leases that are instantiated as virtual machines (VMs).

Its full integration with OpenNebula will be part of the next Technology Preview (TP1.1), due mid-July. Haizea is being developed by Borja Sotomayor, a PhD student at the University of Chicago, who is now visiting our research group, partially funded by the European Union’s FP7 Reservoir project (“Resources and Services Virtualization without Barriers”).

Ignacio Martín Llorente