The recently announced beta release of OpenNebula 3.0 includes a new OpenNebula Zones component that brings support for building multi-tier cloud architectures consisting of multiple OpenNebula instances (zones) and for defining Virtual Data Centers (VDCs) within each zone. In this article we elaborate on the VDC functionality that is helping many IT organizations make the transition toward the next generation of cloud infrastructures running multiple fully-isolated Virtual Data Centers. This article presents an overview of the VDC model, the VDC support available in OpenNebula 3.0, and some examples of deployment scenarios.

What Is a VDC?

A Virtual Data Center is a fully-isolated virtual infrastructure environment where a group of users, under the control of a VDC administrator, can create and manage compute, storage and networking capacity. VDCs are a powerful instrument to compartmentalize a cloud infrastructure and to support organizational isolation with advanced multi-tenancy. The cloud administrator creates a VDC by assigning a group of users to a group of physical resources and by granting at least one of them, the VDC administrator, the privileges to manage all virtual resources in the VDC. The users in the VDC, including the VDC administrator, only see the virtual resources, not the underlying physical infrastructure. The physical resources allocated to the VDC by the cloud administrator can be shared with other VDCs or completely dedicated to it, providing isolation at the physical level too.

The powerful ACL system behind OpenNebula's VDCs supports different authorization scenarios: the privileges of VDC users and administrators over the virtual resources created by other users can be configured. In a typical scenario the VDC administrator can create virtual networks, upload and create images and templates, and monitor other users' virtual resources, while regular users can only instantiate virtual machines and virtual networks to build their services. VDC administrators have full control over the resources in the VDC and can also create new users in it.

Users can then access their VDCs through any of the existing OpenNebula interfaces, such as the CLI, Sunstone, OCA, or the OCCI and AWS APIs. VDC administrators can manage their VDCs through the CLI or new tabs in Sunstone. Cloud administrators can manage the VDCs through a new CLI or the new Sunstone Zones interface.

VDCs have three categories of users:

  • Cloud administrators, with full control over the cloud deployment, including the creation and management of VDCs
  • VDC administrators, with full control over the virtual resources within their VDCs, including the creation of users in their VDCs
  • Regular users, who can access their VDCs to manage their virtual resources

Examples of Enterprise Use Cases of VDCs

VDCs, and the underlying ACL system, can support many common enterprise use cases in large cloud computing deployments, for example:

  • On-premise Private Clouds Serving Multiple Projects, Departments, Units or Organizations. On-premise private clouds in large organizations require powerful and flexible mechanisms to manage access privileges to the virtual and physical infrastructure and to dynamically allocate the available resources. In these scenarios, the cloud administrator would create a VDC for each department, dynamically allocating physical hosts according to its needs and delegating the internal administration of the VDC to the department's IT administrator.
  • Cloud Providers Offering Virtual Private Cloud Computing. There is a growing number of cloud providers, especially telecom operators, that offer Virtual Private Cloud environments to extend their customers' private clouds over virtual private networks, providing a more reliable and secure alternative to traditional public cloud providers. In this scenario, the cloud provider gives each customer a fully-configurable and isolated VDC where they have full control and the capacity to administer their own users and resources. This combines a public cloud with the protection and control usually found in a private cloud. Users can create and configure servers themselves via the Sunstone portal or any of the supported cloud APIs, and the total amount of physical resources allocated to the virtual private cloud can be adjusted over time.

Are You Ready to Try the New OpenNebula Zones?

OpenNebula 3.0 is a fully open-source technology. You have the software, the guides and our support to deploy your cloud infrastructure with multiple VDC environments.

In this post, I will explain how to install OpenNebula on two servers in a fully redundant environment. This is the English translation of an article in Italian on my blog.

The idea is to have two Cloud Controllers in High Availability (HA) active/passive mode using Pacemaker/Heartbeat. These nodes will also provide storage by exporting a DRBD partition via ATA-over-Ethernet; the VM disks will be created as logical LVM volumes in this partition. Besides being fully redundant, this solution provides high-speed storage, because we deploy the VM partitions via LVM snapshots rather than as files on an NFS filesystem.

Nonetheless, we will still use NFS to export the /srv/cloud directory with OpenNebula data.

System Configuration

As a reference, this is the configuration of our own servers. Your servers do not have to be exactly the same; we will simply be using these two servers to explain certain aspects of the configuration.

First Server:

  • Ubuntu Server 10.10, 64-bit
  • eth0 and eth1 configured in bonding (SAN network)
  • eth2 (LAN network)
  • 1 TB internal HD partitioned as follows:
    • sda1: 40 GB mounted on /
    • sda2: 8 GB swap
    • sda3: 1 GB for metadata
    • sda5: 40 GB for /srv/cloud/one
    • sda6: 850 GB datastore

Second Server:

  • Ubuntu Server 10.10, 64-bit
  • eth0 and eth1 configured in bonding (SAN network)
  • eth2 (LAN network)
  • 1 TB internal HD partitioned as follows:
    • sda1: 40 GB mounted on /
    • sda2: 8 GB swap
    • sda3: 1 GB for metadata
    • sda5: 40 GB for /srv/cloud/one
    • sda6: 850 GB datastore

Installing the base system

Install Ubuntu Server 10.10 64-bit on both servers, enabling the OpenSSH server during installation. In our case, each server is equipped with two 1 TB SATA disks in a hardware mirror, on which we create a 40 GB partition (sda1) for the root filesystem, an 8 GB partition (sda2) for swap, a third (sda3) of 1 GB for DRBD metadata, a fourth (sda5) of 40 GB for the directory /srv/cloud/one replicated by DRBD, and a fifth (sda6) with the remaining space (approximately 850 GB) that DRBD will use to export the VM filesystems.

In terms of networking, each server has three network cards: two (eth0, eth1) are configured in bonding to handle data replication and communication with the compute nodes on the cluster network (SAN), and a third (eth2) provides access from outside the cluster on the LAN.

Unless otherwise specified, these instructions are specific to the above two hosts, but should work on your own system with minor modifications.

Network Configuration

First we add entries for all the cluster hostnames to /etc/hosts on both servers (each entry preceded by its IP address):

cloud-cc.lan.local cloud-cc
cloud-cc01.lan.local
cloud-cc02.lan.local
cloud-01.san.local
cloud-02.san.local
cloud-03.san.local
cloud-cc.san.local
cloud-cc01.san.local cloud-cc01
cloud-cc02.san.local cloud-cc02

Next we proceed to the network configuration. First configure the bonding interface, installing the required packages:

apt-get install ethtool ifenslave

Then we load the module at startup with the correct parameters by creating the file /etc/modprobe.d/bonding.conf:

alias bond0 bonding
options bonding mode=0 miimon=100 downdelay=200 updelay=200

And configure the interfaces in /etc/network/interfaces (bond0 on the SAN, eth2 on the LAN):

auto bond0
iface bond0 inet static
bond_miimon  100
bond_mode balance-rr
address # on server 2
up /sbin/ifenslave bond0 eth0 eth1
down /sbin/ifenslave -d bond0 eth0 eth1

auto eth2
iface eth2 inet static
address # on server 2

Configuring MySQL

I prefer to configure MySQL circular replication rather than managing the service through Heartbeat: with MySQL already active on both servers, we save a few seconds during a switchover in case of a fault.

First we install MySQL:

apt-get install mysql-server libmysqlclient16-dev libmysqlclient

and create the database for OpenNebula:

mysql -p
create database opennebula;
create user oneadmin identified by 'oneadmin';
grant all on opennebula.* to 'oneadmin'@'%';

Then we configure active/active replica on server 1:

/etc/mysql/conf.d/replica.cnf @ Server 1
bind-address			=
server-id                       = 10
auto_increment_increment        = 10
auto_increment_offset           = 1
master-host                     = server2.dominio.local
master-user                     = replicauser
master-password                 = replicapass
log_bin				= /var/log/mysql/mysql-bin.log
binlog_ignore_db		= mysql

And on server 2:

/etc/mysql/conf.d/replica.cnf @ server 2
bind-address			=
server-id                       = 20
auto_increment_increment        = 10
auto_increment_offset           = 2
master-host                     = server1.dominio.local
master-user                     = replicauser
master-password                 = replicapass
log_bin				= /var/log/mysql/mysql-bin.log
binlog_ignore_db		= mysql
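The auto_increment settings are what keep the two masters from colliding: each server generates AUTO_INCREMENT ids as offset + n × increment, so server 1 produces 1, 11, 21, … while server 2 produces 2, 12, 22, …. A quick illustration of the scheme:

```shell
# AUTO_INCREMENT id = auto_increment_offset + n * auto_increment_increment
for n in 0 1 2; do
  echo "server1: $((1 + n * 10)) server2: $((2 + n * 10))"
done
# server1 ids: 1, 11, 21 ...; server2 ids: 2, 12, 22 ...
```

Because the two id sequences are disjoint, rows inserted concurrently on both masters replicate in either direction without primary-key conflicts.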

Finally, on both servers, restart MySQL and create the replication user:

create user 'replicauser'@'%.san.local' identified by 'replicapass';
grant replication slave on *.* to 'replicauser'@'%.dominio.local';
start slave;
show slave status\G

DRBD Configuration

Now it is DRBD's turn, configured in standard active/passive mode. First install the needed package and load the module:

apt-get install drbd8-utils
modprobe drbd

Then we edit the configuration file, /etc/drbd.conf:

global {
	usage-count yes;
	# minor-count dialog-refresh disable-ip-verification
}

common {
	protocol C;

	handlers {
		pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
		pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
		local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
		# fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
		# split-brain "/usr/lib/drbd/notify-split-brain.sh root";
		# out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
		# before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
		# after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
	}

	startup {
		# wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
		wfc-timeout 120; ## 2 minutes
		degr-wfc-timeout 120; ## 2 minutes
	}

	disk {
		# on-io-error fencing use-bmbv no-disk-barrier no-disk-flushes
		# no-disk-drain no-md-flushes max-bio-bvecs
		on-io-error detach;
	}

	net {
		# sndbuf-size rcvbuf-size timeout connect-int ping-int ping-timeout max-buffers
		# max-epoch-size ko-count allow-two-primaries cram-hmac-alg shared-secret
		# after-sb-0pri after-sb-1pri after-sb-2pri data-integrity-alg no-tcp-cork
		# allow-two-primaries;
		# after-sb-0pri discard-zero-changes;
		# after-sb-1pri discard-secondary;

		timeout 60;
		connect-int 10;
		ping-int 10;
		max-buffers 2048;
		max-epoch-size 2048;
	}

	syncer {
		# rate after al-extents use-rle cpu-mask verify-alg csums-alg
		rate 500M;
	}
}

And let’s create the one-disk definition:

resource one-disk {
    on cloud-cc01 {
	device /dev/drbd1;
	disk /dev/sda5;
	meta-disk /dev/sda3[0];
    }
    on cloud-cc02 {
	device /dev/drbd1;
	disk /dev/sda5;
	meta-disk /dev/sda3[0];
    }
}

and data-disk:

resource data-disk {
    on cloud-cc01 {
	device /dev/drbd2;
	disk /dev/sda6;
	meta-disk /dev/sda3[1];
    }
    on cloud-cc02 {
	device /dev/drbd2;
	disk /dev/sda6;
	meta-disk /dev/sda3[1];
    }
}

Now, on both nodes, we create the metadata disk:

drbdadm create-md one-disk
drbdadm create-md data-disk
/etc/init.d/drbd reload

Finally, only on server 1, activate the disk:

drbdadm -- --overwrite-data-of-peer primary one-disk
drbdadm -- --overwrite-data-of-peer primary data-disk
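The initial synchronization of the 850 GB data partition can take a long time; assuming the drbd module is loaded, its progress can be checked at any moment on either node:

```shell
cat /proc/drbd            # connection state, roles, and a progress percentage while syncing
watch -n2 'cat /proc/drbd'  # follow the sync interactively
```

Wait for both resources to report an UpToDate/UpToDate disk state before moving on.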

Exporting the disks

As already mentioned, the two DRBD partitions will be visible through the network, although in different ways: one-disk will be exported through NFS, data-disk will be exported by ATA-over-Ethernet and will present its LVM partitions to the hypervisor.

Install the packages:

apt-get install vblade nfs-kernel-server nfs-common portmap

We’ll disable automatic NFS and AoE startup because we handle it via HeartBeat:

update-rc.d nfs-kernel-server disable
update-rc.d vblade disable

Then we create the export for the OpenNebula directory in /etc/exports:
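A typical /etc/exports entry for this setup might look like the following (the client specification is an assumption; adapt it to your SAN addressing):

```shell
# /etc/exports -- clientspec is an example, adjust to your network
/srv/cloud/one *.san.local(rw,sync,no_subtree_check,no_root_squash)
```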


and we create the necessary directory:

mkdir -p /srv/cloud/one

Finally, we have to configure the idmapd daemon in /etc/idmapd.conf so that users and permissions are propagated correctly over the network:


[General]
Verbosity = 0
Pipefs-Directory = /var/lib/nfs/rpc_pipefs
Domain = lan.local # Modify this

[Mapping]
Nobody-User = nobody
Nobody-Group = nobody

Finally we have to check the default NFS settings. In /etc/default/nfs-kernel-server:

NEED_SVCGSSD=no # no is default

and in /etc/default/nfs-common:

NEED_GSSD=no # no is default

Fault Tolerant daemon configuration

There are two stacks that can manage highly available services on Linux: Corosync and Heartbeat. I personally prefer Heartbeat, and the instructions below refer to it, but since most of the configuration is done through Pacemaker, you are perfectly free to opt for Corosync instead.

First install the needed packages:

apt-get install heartbeat pacemaker

and configure the heartbeat daemon in /etc/ha.d/ha.cf:

autojoin none
bcast bond0
warntime 3
deadtime 6
initdead 60
keepalive 1
node cloud-cc01
node cloud-cc02
crm respawn

Only on the first server, we create the authkeys file and copy it to the second server:

( echo -ne "auth 1\n1 sha1 "; \
  dd if=/dev/urandom bs=512 count=1 | openssl md5 ) \
  > /etc/ha.d/authkeys
chmod 0600 /etc/ha.d/authkeys
scp /etc/ha.d/authkeys cloud-cc02:/etc/ha.d/
ssh cloud-cc02 chmod 0600 /etc/ha.d/authkeys
/etc/init.d/heartbeat restart
ssh cloud-cc02 /etc/init.d/heartbeat restart

After a minute or two, heartbeat will be online:

crm_mon -1 | grep Online
Online: [ cloud-cc01 cloud-cc02 ]

Now we’ll configure the cluster services via Pacemaker. First, set the default options:

crm configure
property no-quorum-policy=ignore
property stonith-enabled=false
property default-resource-stickiness=1000

The two shared IPs (SAN and LAN):

crm configure
primitive lan_ip IPaddr params ip= cidr_netmask="" nic="eth2" op monitor interval="40s" timeout="20s"
primitive san_ip IPaddr params ip= cidr_netmask="" nic="bond0" op monitor interval="40s" timeout="20s"

The DRBD master/slave resource for one-disk:

crm configure
primitive drbd_one ocf:linbit:drbd params drbd_resource="one-disk" op monitor interval="40s" timeout="20s"
ms ms_drbd_one drbd_one meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"

The one-disk mount:

crm configure
primitive fs_one ocf:heartbeat:Filesystem params device="/dev/drbd/by-res/one-disk" directory="/srv/cloud/one" fstype="ext4"
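The ha_group defined later also references an nfs_one resource that manages the NFS export itself; a sketch using the standard ocf:heartbeat:exportfs agent (the client specification and fsid are assumptions) would be:

```shell
crm configure
primitive nfs_one ocf:heartbeat:exportfs params directory="/srv/cloud/one" clientspec="*.san.local" options="rw,sync,no_root_squash" fsid="1" op monitor interval="40s" timeout="20s"
```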

The DRBD master/slave resource for data-disk:

crm configure
primitive drbd_data ocf:linbit:drbd params drbd_resource="data-disk"  op monitor interval="40s" timeout="20s"
ms ms_drbd_data drbd_data meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"

The AoE export of data-disk:

crm configure
primitive aoe_data ocf:heartbeat:AoEtarget params device="/dev/drbd/by-res/data-disk" nic="bond0" shelf="0" slot="0" op monitor interval="40s" timeout="20s"

Now we have to configure the correct startup order for the services:

crm configure
group ha_group san_ip lan_ip fs_one nfs_one aoe_data
colocation ha_col inf: ha_group ms_drbd_one:Master ms_drbd_data:Master
order ha_after_drbd inf: ms_drbd_one:promote ms_drbd_data:promote ha_group:start

We will modify this configuration later to add OpenNebula and lighttpd startup.

LVM Configuration

LVM2 will allow us to create partitions for the virtual machines and deploy them on a snapshot basis.

Install the package on both machines.

apt-get install lvm2

We have to modify the filter configuration in /etc/lvm/lvm.conf so that LVM scans only the DRBD devices:

filter = [ "a|drbd.*|", "r|.*|" ]
write_cache_state = 0

ATTENTION: Ubuntu boots from an initramfs, so we also have to modify the lvm.conf copy inside the initramfs.
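After editing /etc/lvm/lvm.conf, regenerating the initramfs makes the boot environment pick up the new filter:

```shell
update-initramfs -u   # rebuild the initramfs so it contains the updated lvm.conf
```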

Now we remove the cache:

rm /etc/lvm/cache/.cache

Only on server 1, we create the LVM physical volume and the volume group:

pvcreate /dev/drbd/by-res/data-disk
vgcreate one-data /dev/drbd2
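To give an idea of how the tm_lvm driver will use this volume group: VM disks are created as copy-on-write snapshots of a master logical volume, along these lines (the volume names here are illustrative, not the driver's actual naming scheme):

```shell
lvcreate -L 10G -n lv-master one-data                      # master image volume
lvcreate -s -L 1G -n lv-vm5-disk0 /dev/one-data/lv-master  # snapshot used as a VM disk
```

A snapshot is created almost instantly regardless of the image size, which is what makes this setup much faster than copying image files over NFS.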

Install and configure OpenNebula

We are almost done. Now we download and install OpenNebula 2.2 from source.

First we have to install prerequisites:

apt-get install libsqlite3-dev libxmlrpc-c3-dev scons g++ ruby libopenssl-ruby libssl-dev ruby-dev make rake rubygems libxml-parser-ruby1.8 libxslt1-dev libxml2-dev genisoimage  libsqlite3-ruby libsqlite3-ruby1.8 rails thin
gem install nokogiri
gem install json
gem install sinatra
gem install rack
gem install thin
cd /usr/bin
ln -s rackup1.8 rackup

Then we have to create OpenNebula user and group:

groupadd cloud
useradd -d /srv/cloud/one  -s /bin/bash -g cloud -m oneadmin
chown -R oneadmin:cloud /srv/cloud/
chmod 775 /srv
id oneadmin # we have to use this id also on cluster node for oneadmin/cloud

Now we switch to the unprivileged user to create the SSH keys for cluster communications:

su - oneadmin
ssh-keygen # use default
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 640 ~/.ssh/authorized_keys
mkdir ~/.one

We create a .profile file with default variables:

export ONE_AUTH='/srv/cloud/one/.one/one_auth'
export ONE_LOCATION='/srv/cloud/one'
export ONE_XMLRPC='http://localhost:2633/RPC2'
export PATH=$PATH':/srv/cloud/one/bin'

Now we have to create the one_auth file to set up the default user inside OpenNebula (used, for example, by the APIs or Sunstone):
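The file contains a single username:password line; for example (the credentials here are placeholders, pick your own):

```shell
mkdir -p /srv/cloud/one/.one
echo 'oneadmin:oneadmin' > /srv/cloud/one/.one/one_auth
chmod 600 /srv/cloud/one/.one/one_auth
```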


And load default variables before compile:

source .profile

Now download and install OpenNebula:

tar zxvf opennebula-2.2.tar.gz
cd opennebula-2.2
scons -j2 mysql=yes
./install.sh -d /srv/cloud/one

About the configuration: this is my oned.conf file. I use the Xen hypervisor, but you can also use KVM.






DB = [ backend = "mysql",
       server  = "localhost",
       port    = 0,
       user    = "oneadmin",
       passwd  = "oneadmin",
       db_name = "opennebula" ]




MAC_PREFIX   = "02:ab"

IMAGE_REPOSITORY_PATH = /srv/cloud/one/var/images

IM_MAD = [
    name       = "im_xen",
    executable = "one_im_ssh",
    arguments  = "xen" ]

VM_MAD = [
    name       = "vmm_xen",
    executable = "one_vmm_ssh",
    arguments  = "xen",
    default    = "vmm_ssh/vmm_ssh_xen.conf",
    type       = "xen" ]

TM_MAD = [
    name       = "tm_lvm",
    executable = "one_tm",
    arguments  = "tm_lvm/tm_lvm.conf" ]

HM_MAD = [
    executable = "one_hm" ]

VM_HOOK = [
    name      = "image",
    on        = "DONE",
    command   = "image.rb",
    arguments = "$VMID" ]

HOST_HOOK = [
    name      = "error",
    on        = "ERROR",
    command   = "host_error.rb",
    arguments = "$HID -r n",
    remote    = "no" ]

VM_HOOK = [
    name      = "on_failure_resubmit",
    on        = "FAILED",
    command   = "/usr/bin/env onevm resubmit",
    arguments = "$VMID" ]

The only important thing is to modify /srv/cloud/one/etc/tm_lvm/tm_lvm.rc to set the default volume group:
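The setting should point at the volume group created earlier; assuming the variable is the driver's default VG_NAME:

```shell
# /srv/cloud/one/etc/tm_lvm/tm_lvm.rc
VG_NAME="one-data"
```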


Now copy the init.d script from the source tree to /etc/init.d, but do not set it to start at boot.

I have modified the default script to startup also sunstone:

#! /bin/sh
### BEGIN INIT INFO
# Provides:          opennebula
# Required-Start:    $remote_fs
# Required-Stop:     $remote_fs
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: OpenNebula init script
# Description:       OpenNebula cloud initialisation script
### END INIT INFO

# Author: Soren Hansen - modified by Alberto Zuin

DESC="OpenNebula cloud"
NAME=one
SCRIPTNAME=/etc/init.d/$NAME
DAEMON=/srv/cloud/one/bin/one
SUNSTONE=/srv/cloud/one/bin/sunstone-server

# Exit if the package is not installed
[ -x "$DAEMON" ] || exit 0

# Load the VERBOSE setting and other rcS variables
. /lib/init/vars.sh

# Define LSB log_* functions.
# Depend on lsb-base (>= 3.0-6) to ensure that this file is present.
. /lib/lsb/init-functions

# Function that starts the daemon/service
do_start()
{
	mkdir -p /var/run/one /var/lock/one
	chown oneadmin /var/run/one /var/lock/one
	su - oneadmin -s /bin/sh -c "$DAEMON start"
	su - oneadmin -s /bin/sh -c "$SUNSTONE start"
}

# Function that stops the daemon/service
do_stop()
{
	su - oneadmin -s /bin/sh -c "$SUNSTONE stop"
	su - oneadmin -s /bin/sh -c "$DAEMON stop"
}

case "$1" in
  start)
	[ "$VERBOSE" != no ] && log_daemon_msg "Starting $DESC" "$NAME"
	do_start
	case "$?" in
		0|1) [ "$VERBOSE" != no ] && log_end_msg 0 ;;
		2) [ "$VERBOSE" != no ] && log_end_msg 1 ;;
	esac
	;;
  stop)
	[ "$VERBOSE" != no ] && log_daemon_msg "Stopping $DESC" "$NAME"
	do_stop
	case "$?" in
		0|1) [ "$VERBOSE" != no ] && log_end_msg 0 ;;
		2) [ "$VERBOSE" != no ] && log_end_msg 1 ;;
	esac
	;;
  restart|force-reload)
	# If the "reload" option is implemented then remove the
	# 'force-reload' alias
	log_daemon_msg "Restarting $DESC" "$NAME"
	do_stop
	case "$?" in
	  0|1)
		do_start
		case "$?" in
			0) log_end_msg 0 ;;
			1) log_end_msg 1 ;; # Old process is still running
			*) log_end_msg 1 ;; # Failed to start
		esac
		;;
	  *)
		# Failed to stop
		log_end_msg 1
		;;
	esac
	;;
  *)
	echo "Usage: $SCRIPTNAME {start|stop|restart|force-reload}" >&2
	exit 3
	;;
esac

and set it with execute permissions:

chmod 755 /etc/init.d/one

Configuring the HTTPS proxy for Sunstone

Sunstone is the web interface for cloud administration, if you do not want to use the command line. It works on port 4567 and is not encrypted, so we will use lighttpd to proxy requests over an HTTPS-encrypted connection.

First install the daemon:

apt-get install ssl-cert lighttpd

Then generate certificates:

/usr/sbin/make-ssl-cert generate-default-snakeoil
cat /etc/ssl/private/ssl-cert-snakeoil.key /etc/ssl/certs/ssl-cert-snakeoil.pem > /etc/lighttpd/server.pem

and create symlinks to enable ssl and proxy modules:

ln -s /etc/lighttpd/conf-available/10-ssl.conf /etc/lighttpd/conf-enabled/
ln -s /etc/lighttpd/conf-available/10-proxy.conf /etc/lighttpd/conf-enabled/

And modify the lighttpd configuration to proxy requests to Sunstone:

proxy.server = ( "" =>
                 ( ( "host" => "",
                     "port" => 4567 ) ) )

Starting LightHTTP and OpenNebula with heartbeat

Now we add the startup scripts to heartbeat's management. First, put both nodes in standby:

crm node
standby cloud-cc01
standby cloud-cc02

Then we can change the configuration:

crm configure
primitive OpenNebula lsb:one
primitive lighttpd lsb:lighttpd
delete ha_group
group ha_group san_ip lan_ip fs_one nfs_one aoe_data OpenNebula lighttpd
colocation ha_col inf: ha_group ms_drbd_one:Master ms_drbd_data:Master
order ha_after_drbd inf: ms_drbd_one:promote ms_drbd_data:promote ha_group:start

And startup the cluster again:

crm node
online cloud-cc01
online cloud-cc02

That’s all folks!
Alberto Zuin –

Thanks to the hybrid cloud computing functionality offered by OpenNebula, external clouds are seen in an OpenNebula system as other local resources. In this way, Sunstone allows the management of all the public clouds that are used in the hybrid configuration of any given enterprise.

This setup is shown in the screenshot: a private cloud with associated external resources (in this case, three Amazon availability zones) to which compute power can be outsourced during peak demand.

HPC in the Cloud has just published an article with a description of the progress made in the implementation of the CERN IaaS cloud. Rubén S. Montero, our Chief Architect, shares insights about managing CERN’s infrastructure. At the heart of the IaaS cloud CERN has implemented is OpenNebula, which is starting to serve as the management layer in production following extensive prototyping and testing. Montero describes the project’s evolution and current status.

C12G Labs has published four use cases describing how OpenNebula can be used to build private and hybrid clouds in different domains:

The Management of Data Information, and Knowledge Group (MaDgIK) at the University of Athens focuses on several research areas, such as Database and Information Systems, Distributed Systems, Query Optimization, and Digital Libraries. During the past few years, projects within this group started offering and sharing hardware resources through a virtualized infrastructure. We eventually built our own IaaS-cloud using open source software, namely Xen, Debian and OpenNebula. However, that was not enough, since we needed custom solutions to suit our needs. At that point we called upon Eolus, the god of the winds, to blow and shape the clouds.

Eolus is our open source attempt to join the forces of OpenNebula and Java Enterprise Edition. It is far from being an end product, yet it provides functionality that serves our purposes. In short, OpenNebula is used as a management tool for virtual resources that we exploit in building higher level custom services available through a JEE application container. An advanced VM scheduler called Nefeli and a web based administration console are only a couple of such high level components we offer to users. Our success stories include undergraduate theses, researchers and European funded projects (e.g. D4Science) experimenting and exploiting cloud resources. You can have a glimpse of our efforts, released under the EUPL licence, at

The Supercomputing Center of Galicia (CESGA) and the Supercomputing Center Foundation of Castilla y León (FCSCL) have built a federation of cloud infrastructures using the hybrid cloud computing functionality provided by OpenNebula. Both organizations have collaborated in order to execute an application to fight Malaria across both sites. This is a very interesting use case of cloud federation in the High Performance Computing field.

Last week at ISC Cloud 2010, Ulrich Schwickerath, from the CERN IT-PES/PS Group, presented the latest benchmarking results of CERN’s OpenNebula cloud for batch processing. The batch computing farm is a critical part of the CERN data centre. By making use of the new IaaS cloud, both the virtual machine provisioning system and the batch application itself have been tested extensively at large scale. The results show OpenNebula managing 16,000 virtual machines to support a virtualized computing cluster that executes 400,000 jobs.

OpenNebula 2.0 emphasizes interoperability and portability, providing cloud users and administrators with choice across most popular cloud interfaces, hypervisors and public clouds for hybrid cloud computing deployments, and with a flexible software that can be installed in any hardware and software combination. The functionality provided by the new version of OpenNebula and the components in its quickly growing ecosystem enable:

Because no two data centers are the same, building a cloud computing infrastructure requires the integration and orchestration of the underlying existing IT systems, services and processes. OpenNebula enables interoperability and portability, recognizing that our users have data centers composed of different hardware and software components for security, virtualization, storage, and networking. Its open architecture, interfaces and components provide the flexibility and extensibility that many enterprise IT shops need for internal cloud adoption. You only have to choose the right design and configuration for your cloud architecture, depending on your existing IT architecture and the execution requirements of your service workload.

Ignacio M. Llorente

The D-Grid Resource Center Ruhr (DGRZR) was established in 2008 at Dortmund University of Technology as part of the German Grid initiative D-Grid. In contrast to other resources, DGRZR used virtualization technologies from the start and still runs all Grid middleware, batch system and management services in virtual machines. In 2010, DGRZR was extended by the installation of OpenNebula as its Compute Cloud middleware to manage our virtual machines as a private cloud.

At present, the resource center is not only a production site of D-Grid, but also of NGI-DE (National Grid Initiative-Deutschland). Additionally it will be used as prototype for the integration of an EC2-compatible Compute Cloud middleware as a new pillar in the D-Grid software stack. After successful integration, DGRZR will act as public cloud resource and allow D-Grid members to deploy their virtual appliances.
The following diagram summarizes the DGRZR architecture:

OpenNebula at D-Grid Resource Center Ruhr

OpenNebula at D-Grid Resource Center Ruhr

Physical resources:

DGRZR consists of 256 HP blade servers with eight CPU cores (2048 cores in total) and 16 gigabytes of RAM each. The disk space per server is about 150 gigabytes; 50% of this space is reserved for virtual machine images. The operating system on the physical servers is SUSE Linux Enterprise Server (SLES) 10 Service Pack 3 and will be changed to SLES 11 in the near future. We provide our D-Grid users with roughly 100 terabytes of central storage, mainly for home directories, experiment software and the dCache Grid Storage Element. In 2009, the mass storage was upgraded by adding 25 terabytes of HP Scalable File Share 3.1 (a Lustre-like file system), which is currently being migrated to version 3.2. 250 of the 256 blade servers will typically run virtual worker nodes. The remaining servers run virtual machines for the Grid middleware services (gLite, Globus Toolkit and UNICORE), the batch system server, and other management services.


The network configuration of the resource center is static and assumes a fixed mapping from the MAC address of the virtual machine to its public IP address. For each available node type (worker nodes, Grid middleware services and management services) a separate virtual LAN exists, and DNS names for the possible leases have been set up in advance in the university's central DNS servers.

Image repository and distribution:

The repository consists of images for the worker nodes, based on Scientific Linux 4.8 and 5.4, and for the UNICORE and Globus Toolkit services. We will soon be working on creating images for the gLite services.

The master images that are cloned to the physical servers are located on an NFS server and are kept up to date manually. The initial creation of such images (including the installation and configuration of Grid services) is currently done manually, but will be replaced in the near future by automated workflows. The distribution of these images to the physical servers happens on demand and uses the OpenNebula SSH transfer mechanism. Currently we have no need to pre-stage virtual machine images to the physical servers, but we may add this using scp-wave.

The migration of virtual machines has been tested in conjunction with SFS 3.1, but production usage has been postponed until the completion of the file system upgrade.


The version currently used is an OpenNebula 1.4 GIT snapshot from March 2010. Due to some problems of SLES 10 with Xen (e.g. “tap:aio” not really working), modifications to the snapshot were made. In addition, we set up the OpenNebula Management Console and use it as a graphical user interface.

The SQLite3 database back-end performs well for the limited number of virtual machines we are running, but with the upgrade to OpenNebula 1.6 we will migrate to a MySQL back-end to prepare for an extension of our cloud to other clusters. Using Haizea as lease manager seems out of scope at the moment. With the upcoming integration of this resource as D-Grid IaaS resource, scheduler features like advanced reservations are mandatory.

Stefan Freitag (Robotics Research Institute, TU Dortmund)
Florian Feldhaus (ITMC, TU Dortmund)