SVMSched: a tool to enable On-demand SaaS and PaaS on top of OpenNebula

SVMSched [1] [2] is a tool designed to enable on-demand SaaS clouds on virtualized infrastructures, and can also easily be set up to support PaaS clouds. SVMSched can be used to build cloud platforms where a service is deployed to process a user-given dataset with a predefined application, based on given hardware requirements (CPU, memory). In such a context, SVMSched seamlessly and automatically creates a custom virtual computing environment to run the service on-the-fly. This virtual computing environment starts executing the service at startup and is automatically destroyed once the execution completes, freeing the allocated resources.

Benefits of SVMSched

  • Configuration-based On-demand Cloud Services: A SVMSched cloud is based on a single configuration file in which you define the set of software services you wish to provide from your virtualized infrastructure. This configuration file also holds the parameters needed to connect to the OpenNebula server, the scripts and data necessary to automatically build the virtual environments that run the services, etc.
  • Automatic provisioning and high-level abstraction of virtual machines: After deploying SVMSched in your cloud infrastructure, you no longer need to manipulate virtual machine templates. To run a service, you only need to make a simple request of the form “I want a virtual machine with 4 CPUs and 512 MB of memory to compute a given set of data with a specific application”. SVMSched then does the rest for you: it prepares the virtual machine image, instantiates the virtual machine, deploys and starts it on a node it selects transparently, launches the service inside the virtual machine, and shuts the virtual machine down when the execution completes.
  • Scheduling: SVMSched enables advanced scheduling policies such as task prioritization, best-effort execution with automatic preemption and resumption (plus migration where required), resource sharing, etc.
  • Remote Data Repository: SVMSched lets you define shared data repositories on the network that are mounted automatically into the file system of virtual machines at startup, before the service starts executing (as sketched below). Such a repository is useful for storing binaries and any other data required by the compute tasks, and thus eases the handling of input and output data. It also lets you avoid handling large virtual machine images (which take a long time to set up), while minimizing the risk of losing data and already-completed computation if a virtual machine fails unexpectedly.
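
To give a concrete idea, such a repository is typically a directory exported over the network by a storage server and mounted inside each virtual machine at boot, before the service starts. Below is a minimal sketch assuming an NFS export; the server name and paths are placeholders, and in an SVMSched cloud the equivalent mount is performed automatically, driven by the configuration file.

  # Mount the shared data repository inside the VM at startup.
  # "repo-server" and both paths are placeholders used for illustration.
  mount -t nfs repo-server:/exports/repository /data/repository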

Integration Architecture

The figure below shows the architecture for integrating OpenNebula with SVMSched. In brief, SVMSched:

  • Works as a drop-in replacement for OpenNebula’s default scheduler (mm_sched).
  • Exposes a socket interface managed by a listening daemon. The socket works over IP, so clients can connect remotely.
  • Provides a built-in UNIX-like command-line client, which can run on a different server than the SVMSched daemon.
  • Communicates with OpenNebula through its XML-RPC interface, so SVMSched and OpenNebula can be hosted on different servers (a sketch of such a call is shown after the figure).
  • Relies on a single XML configuration file; there is no need to manipulate virtual machine templates.

SVMSched Integration Architecture
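
To give an idea of what this XML-RPC link looks like, the sketch below sends a hand-crafted request to oned asking for the description of virtual machine 0, the kind of call an external scheduler such as SVMSched issues internally. The host name and credentials are placeholders; 2633 is oned’s default XML-RPC port.

$ curl -s http://one-server:2633/RPC2 -H 'Content-Type: text/xml' --data \
'<?xml version="1.0"?>
<methodCall>
  <methodName>one.vm.info</methodName>
  <params>
    <param><value><string>oneadmin:PASSWORD_HASH</string></value></param>
    <param><value><i4>0</i4></value></param>
  </params>
</methodCall>'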

Use cases

Without being exhaustive, here are some situations where SVMSched can bring you significant added value.

Automatic deployments for on-demand PaaS/SaaS services

Typical contexts include executing services built on computational applications (data/input => processing => results/output), resource/platform leasing, etc. Software testing (validation testing, non-regression testing, etc.) is a typical example. In such a context, the infrastructure behaves as a dynamic virtual cluster in which virtual machines are created and deployed on-the-fly with specific, limited lifetimes, after which they disappear. Each virtual machine has a custom configuration (software stack, number of CPUs, memory size). At the end of its lifetime, which is determined by the time required to run the service, the virtual machine is automatically destroyed to free the allocated resources. Setting up such a cloud only requires the following few steps:

  1. Define one or more services in SVMSched’s configuration file according to your needs. For example, a service can consist of running a specific unit-test script.
  2. If necessary, set up a data repository (a shared network file system) in which the binaries and data required to run the services will be located. Recall that SVMSched can mount this repository automatically into the file system of virtual machines.
  3. Finally, running a service is straightforward. For example, the following command runs an instance of the service named “example-service1” in a virtual machine with 2 CPUs and 1024 MB of memory. In the example, we assume that the input data is located in /data/repository/file.dat, specified with the -a option:
     $ svmschedclient [-H svmsched@server=localhost] --vcpu=2 --memory=1024 \
              -r <example-service1> -a /data/repository/file.dat

On-demand Infrastructure for Training

See here for an example. In such a situation, SVMSched is especially useful to avoid setting up multiple virtual machine templates manually, which can be time-consuming, while still being able to create virtual machines with various hardware and software configurations. For example, assume that you have to deal with several training courses, each requiring a practical session (e.g. parallel programming, web application deployment, etc.). The software and hardware requirements of the virtual machines needed for the different practical sessions can vary considerably and may require setting up a lot of virtual machine templates. You may also need the virtual machines to be destroyed automatically at the end of each practical session (given by a duration). Using SVMSched, only four straightforward things are needed to set up such an infrastructure:

  1. Define each practical session as a service in SVMSched’s configuration file.
  2. For each service, set up a data repository in which the software binaries, libraries and data required for that practical session will be located. Recall that SVMSched can mount this repository automatically into the file system of virtual machines.
  3. For the main program (executable), use a simple script that just sleeps for a given duration (a minimal sketch is given after this list).
  4. Finally, for each student attending a given session, you only need to request a virtual machine with specific hardware requirements (memory and CPU) for a given duration. The example below shows how to create a virtual machine with 2 CPUs, 1024 MB of memory and a lifetime of two hours (7200 seconds). HINT: if all virtual machines need the same requirements, you can use a loop over the number of attendees.
     $ svmschedclient [-H svmsched@server=localhost] --vcpu=2 --memory=1024 \
              -r <training-service-id> -a 7200
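
For step 3, the main program of a training service can be as small as the sketch below: it keeps the virtual machine alive for the requested duration and then exits, so that SVMSched shuts the virtual machine down. It assumes the duration is given in seconds as the argument passed to svmschedclient with -a.

  #!/bin/bash
  # Minimal sketch of a training-service main program: sleep for the
  # requested duration, then exit so that SVMSched can shut the virtual
  # machine down and free its resources.
  # Assumes the duration (in seconds) is the first argument, i.e. the
  # value passed to svmschedclient with -a.
  DURATION=${1:-7200}
  sleep "$DURATION"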

Co-hosting of production and development services

A typical case is when you want to use the idle resources of a production infrastructure to carry out development tasks such as software testing (init tests, non-regression testing or NRT, etc.). SVMSched allows you to distinguish production tasks (prioritized and non-preemptable) from best-effort tasks (non-prioritized and preemptable). When operating, SVMSched can automatically preempt best-effort jobs when there are no resources available to run queued production tasks; preempted jobs are automatically resumed as soon as resources become idle again. The decisions to preempt and resume are taken autonomously. Assuming that you have already set up a SVMSched cloud, the following commands show how to run two jobs in production and best-effort mode, respectively.

$ svmschedclient [-H svmsched@server=localhost] --vcpu=2 --memory=1024 \
          -r <prod-service-id> -a /data/repository/file1.dat [-t prod]
$ svmschedclient [-H svmsched@server=localhost] --vcpu=2 --memory=1024 \
          -r <nrt-service-id> -a /data/repository/file2.dat -t beff

Conclusion

SVMSched (Smart Virtual Machine Scheduler) is a tool designed to enable and ease the set-up of on-demand SaaS and PaaS services on top of OpenNebula. SVMSched is open source and freely available for download [1]. However, SVMSched is still at the development stage and not yet production-ready. Being an ongoing project, feedback and collaboration are appreciated, so don’t hesitate to contact the authors if you have questions, suggestions, comments, etc.

References

[1] SVMSched Home. https://gforge.inria.fr/projects/svmsched/

[2] Rodrigue Chakode, Blaise-Omer Yenke, Jean-Francois Mehaut. Resource Management of Virtual Infrastructure for On-demand SaaS Services. In CLOSER 2011: Proceedings of the 1st International Conference on Cloud Computing and Services Science, pages 352-361. Noordwijkerhout, The Netherlands, May 2011.

Image Creation and Contextualization Guide

C12G has created an Image Contextualization Guide to give guidance on how to create and configure a VM Image to work in the OpenNebula environment. The new guide proposes techniques to create a VM Image from scratch and to prepare existing images to run with OpenNebula.

This article is part of the new Knowledge Base that is being extended by C12G Labs.

OpenNebula Scalability Guide

C12G has created a Scalability Guide to give guidance on how to install and tune OpenNebula for optimal and scalable performance in your environment. The software comes with several modifiable parameters that can be adapted to the specific needs of your infrastructure and workload.

This article is part of the new Knowledge Base that is being extended by C12G Labs.

OCCI 1.1 for OpenNebula

A recommendation for version 1.1 of the Open Cloud Computing Interface (OCCI) was recently released by the Open Grid Forum (OGF) (see OGF183 and OGF184). To add OCCI 1.1 support for OpenNebula, we created the Ecosystem project “OCCI for OpenNebula”. The goal of the project is to develop a complete, robust and interoperable implementation of OCCI 1.1 for OpenNebula.

Although the project is still in an early stage, today we released a first version that supports creating and deleting Virtual Networks, Images and Machines. Information on installation and configuration of the OCCI 1.1 extension can be found in the Wiki of the project.
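
As a quick taste of the interface, the sketch below shows what a request to create a compute resource could look like using the OCCI 1.1 text rendering defined in the OGF documents. The endpoint, port and authentication options are placeholders and depend on how the extension is deployed; see the project wiki for the actual configuration.

# Endpoint and port are placeholders; authentication options are omitted.
$ curl -X POST http://occi-server:4567/compute/ \
       -H 'Content-Type: text/occi' \
       -H 'Category: compute; scheme="http://schemas.ogf.org/occi/infrastructure#"; class="kind"' \
       -H 'X-OCCI-Attribute: occi.compute.cores=2' \
       -H 'X-OCCI-Attribute: occi.compute.memory=1.0'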

Florian Feldhaus, Piotr Kasprzak – TU Dortmund

Integrating Public Clouds with OpenNebula for Cloudbursting

C12G has created an introductory article to describe how to integrate public clouds with OpenNebula for Cloudbursting. The white paper describes the integration of public clouds with private cloud instances running OpenNebula. A general provisioning scenario that combines local and external cloud resources is first described. Afterwards, the architecture of OpenNebula and the main components involved in a hybrid cloud setting are briefly presented. The document ends with some considerations and the minimum requirements to deploy a service in a hybrid cloud.

This article is part of the new Knowledge Base that is being extended by C12G Labs.

OpenNebula IRC Sessions

The OpenNebula Team is happy to announce the new OpenNebula IRC Sessions. In these sessions the OpenNebula developers will be available for questions in the #opennebula IRC channel on irc.freenode.net. The developers will answer questions about new features, or about development and configuration issues whose answers cannot be found in the mailing list archive.

These sessions will usually be scheduled in the first week of each month.

First session: Monday, 9 May 2011, 15:00 UTC

Attendees will need an IRC client connected to irc.freenode.net and joined to the #opennebula channel.

Extending the Monitoring System

C12G has created a new article to describe how to extend the OpenNebula monitoring system. OpenNebula needs to monitor the physical resources known to the system in order to extract information that in turn is used by the scheduler to enforce (and comply with) placement policies, keeping the host capacity from being overbooked.

The monitoring system in OpenNebula follows the design guidelines present in the rest of its architecture, including the modularity found in other components. In this case, that modularity is expressed through a plugin approach, in which information is extracted using ‘probes’: simple scripts that return the pieces of information wanted.

This new how-to shows how to extend the monitoring system to extract information about disk space availability, and then use it in Virtual Machine templates to ensure that the chosen host has enough disk space for the image of the Virtual Machine to run.
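
As an illustration of the probe approach (not the exact script from the how-to), a disk-space probe can be as small as the sketch below. It prints a KEY=VALUE pair on standard output, the format used by information-manager probes; the attribute name FREE_DISK and the monitored path are assumptions made for the example. The reported value can then be referenced from a template expression such as REQUIREMENTS = "FREE_DISK > 10240".

  #!/bin/bash
  # Sketch of a monitoring probe: report the free disk space (in MB) of
  # the partition holding the images as a KEY=VALUE pair, the output
  # format used by OpenNebula information-manager probes. The attribute
  # name FREE_DISK and the default path are illustrative assumptions.
  DATASTORE_DIR=${1:-/var/lib/one}
  FREE_MB=$(df -Pm "$DATASTORE_DIR" | awk 'NR==2 {print $4}')
  echo "FREE_DISK=$FREE_MB"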

OpenNebula shared storage with MooseFS

When running many VMs with persistent images, you need shared storage behind the OpenNebula hosts so you can recover faster in case of host failure. However, SANs are expensive, and an NFS server or NAS cannot provide the required performance or fault tolerance.

A distributed, fault-tolerant network filesystem fits easily into this gap. It provides shared storage without the need for dedicated storage hardware, and achieves fault tolerance by replicating your data across different nodes.

I work at LiberSoft, where we evaluated two different open-source distributed filesystems, MooseFS and GlusterFS. A third choice could be Ceph, which is currently under heavy development and probably not yet production-ready, but it would certainly be a good alternative in the near future.

Our choice fell on MooseFS because of its great expandability (you can add as many disks as you want, of any size you prefer) and its web monitor, where you can easily check the status of your shared storage (replication status, disk errors). We have therefore published a new transfer manager in the Ecosystem section, together with some basic instructions to get it working with OpenNebula.
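
To give an idea of the setup (host name and paths are placeholders, and the exact layout depends on the transfer manager scripts), each OpenNebula host mounts the MooseFS volume with mfsmount, and the replication level of the image directory can be controlled with the standard MooseFS tools:

$ mfsmount /srv/cloud/one/var -H mfsmaster       # mount the MooseFS volume on the host
$ mfssetgoal -r 2 /srv/cloud/one/var/images      # keep two replicas of every image file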

We obtained promising results during a test deployment of 4 nodes (Gateway servers with 2x Xeon X3450, 12 GB RAM, 2x 2 TB SATA2 disks) for a private cloud at the National Central Library of Florence (Italy), which will grow as most of its Windows and Linux servers move to the cloud in the next few months.

The requirements of this project were to use ordinary, affordable hardware and open-source software to avoid any possible vendor lock-in, with the aim of lowering energy consumption and hardware maintenance costs.

OpenNebula applying to be a GSoC 2011 mentoring organization

After our successful participation in last year’s Google Summer of Code, the OpenNebula project will once again be applying to be a mentoring organization in Google Summer of Code 2011.

As part of our application, we are compiling a list of possible student projects for this summer. We’d like to encourage members of the OpenNebula community to suggest project ideas and to volunteer to mentor students this summer. If you have an interesting project idea, or would be interested in mentoring an OpenNebula student project this summer, please send a message to our mailing list. Please note that the application deadline is March 11th, so we need to collect all project ideas before then.

OpenNebula packages available in openSUSE

We are happy to announce that an OpenNebula project has been created in the openSUSE Build Service. We would like to thank Robert Schweikert, Peter Linnell and Greg Freemyer for their efforts and for taking the time to answer our questions in their mailing list.

If you have the time to test it or want to help them improve it, please send your feedback to the openSUSE Packaging mailing list!