Posts

China Mobile’s BigCloud Elastic Computing System Based on OpenNebula

Big Cloud is the cloud computing software stack developed by China Mobile Research Institute to support China Mobile’s operation platform and provide services to its more than 600 million customers.

BC-EC (Big Cloud Elastic Computing) chose OpenNebula as its core component to manage and schedule the virtualization infrastructure in 2008. Since then, we have been glad to see OpenNebula become more full-fledged every day. BC-EC, which has matured alongside OpenNebula, is used for China Mobile’s internal business and is ready to provide public services.

BC-EC consists of three parts: a web portal providing a self-service entry point; a front-end management service providing service and operation management, user management, billing, and so on; and a service database. The front-end can provide functions similar to OZone to manage several OpenNebula back-ends.

Recently, China Mobile prepared to launch a public cloud service based on the BC-EC solution. This cloud includes 1,000 servers: 700 of them will provide virtual machine computing services based on BC-EC, and the other 300 will provide cloud storage based on Big Cloud’s object store system, named Onest.

We hope to cooperate closely with the OpenNebula community to improve it based on our requirements and experience, contributing bug fixes and developing new features.

FutureGrid Image Management for Cloud/HPC Infrastructures with OpenNebula

FutureGrid (FG) is a testbed providing users with grid, cloud, and high performance computing infrastructures. FG employs both virtualized and non-virtualized infrastructures. Within the FG project, we offer different IaaS frameworks as well as high performance computing infrastructures by allowing users to explore them as part of the FG testbed.

To ease the use of these infrastructures, as part of performance experiments, we have designed an image management framework, which allows us to create user-defined software stacks based on abstract image management and uniform image registration. Consequently, users can create their own customized environments very easily. The complex processes of the underlying infrastructures are managed by our software tools and services. These software tools are not only able to manage images for IaaS frameworks, but they also allow the registration and deployment of images onto bare metal by the user. This level of functionality is typically not offered in an HPC (high performance computing) infrastructure. Therefore, our approach changes the paradigm from administrator-controlled dynamic provisioning to user-controlled dynamic provisioning, which we also call raining. Thus, users obtain access to a testbed with the ability to manage state-of-the-art software stacks that would otherwise not be supported in typical compute centers. Security is also considered by vetting images before they are registered in an infrastructure. Figure 1 shows the architecture of the image management framework.

Figure 1. Image Management architecture

This framework defines the full life cycle of the images in FutureGrid. It involves the process of creating, customizing, storing, sharing, and registering images for different FG environments. To this end, we have several components to support the different tasks involved. First, we have an Image Generation tool that creates and customizes images according to user requirements (see Figure 2-a). The second component is the Image Repository, which is in charge of storing, cataloging and sharing images. The last component is an Image Registration tool, which prepares, uploads and registers images for specific environments, like HPC or different cloud frameworks (see Figure 2-b). It also decides if an image is secure enough to be registered or if it needs additional security tests.

Figure 2. Image Generation and Image Registration flow charts.

Within this framework, OpenNebula plays an essential role supporting the image creation process. As we can see in Figure 2-a, the image generation component is able to create images from scratch or by cloning images from our image repository. When we generate an image from scratch, the image is created using the bootstrapping tools provided by the different OSes, such as yum for CentOS and debootstrap for Ubuntu. To deal with different OSes and architectures, we use cloud technologies. Consequently, an image is created, with all the packages specified by the user, inside a VM instantiated on demand by OpenNebula. Therefore, multiple users can create multiple images for different operating systems concurrently; this approach provides us with great flexibility, architecture independence, and high scalability.
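
As an illustration only (not FutureGrid’s actual tooling), the following sketch shows the kind of OS bootstrap such an image generation step performs inside an OpenNebula-provisioned VM; it assumes Ubuntu with debootstrap, and the suite name, package list and paths are hypothetical.

# Hedged sketch: bootstrap a minimal Ubuntu root filesystem inside a scratch VM
# and pack it up for the image repository. Suite, packages and paths are examples.
apt-get install -y debootstrap
mkdir -p /mnt/newimage
debootstrap --arch amd64 --include=openssh-server,python lucid \
    /mnt/newimage http://archive.ubuntu.com/ubuntu
# Archive the customized root filesystem so it can be stored and later registered.
tar -czf /tmp/ubuntu-lucid-custom.tar.gz -C /mnt/newimage .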

More information in:

  • J. Diaz, G.v. Laszewski, F. Wang, and G. Fox. “Abstract Image Management and Universal Image Registration for Cloud and HPC Infrastructures”, IEEE Cloud 2012, Honolulu, Hawaii, June 2012.
  • FutureGrid Rain Software Documentation. http://futuregrid.github.com/rain
  • FutureGrid Portal. https://portal.futuregrid.org/

OpenNebula book released!

I am pleased to announce that the first book on OpenNebula, rumored a few months ago, is finally available!

The book has been published by Packt Publishing and is a practical step-by-step guide for newcomers, including:

  • Planning the hardware infrastructure and keeping resources and hardware under monitoring
  • Installing OpenNebula, from source or binary distribution, and configuring it on your front-end host
  • Installing and configuring KVM, Xen and VMware ESXi on your hosts, building from sources when needed
  • Integrating with existing NAS/SAN infrastructure or providing flexible and scalable storage with distributed file-systems (GlusterFS, MooseFS)
  • Managing day-to-day virtual instances via both the command-line and Sunstone web interfaces
  • Monitoring infrastructure continuously using Ganglia
  • Extending your private cloud with resources from Amazon EC2
  • Providing Cloud resources to external facilities through EC2 and OCCI interfaces

You can view the sample chapter and prefaces of the book, including the foreword section written by Ignacio M. Llorente and Rubén S. Montero on PacktLib (or you can download the sample chapter in PDF format).

Experiences in Building a Private Cloud in the Engineering Department at CERN

Virtualization technology and cloud computing have brought a paradigm shift in the way we utilize, deploy and manage computer resources. They allow fast deployment of multiple operating systems as containers on physical machines, which can be either discarded after use or checkpointed for later re-deployment. At the European Organization for Nuclear Research (CERN), we have been using virtualization technology to quickly set up virtual machines for our developers with pre-configured software, enabling them to quickly test and deploy a new version of a software patch for a given application. This article reports on the techniques that have been used to set up a private cloud on commodity hardware and also presents the optimization techniques we used to remove deployment-specific performance bottlenecks.

The key motivation to opt for a private cloud has been the way we use the infrastructure. Our user community includes developers, testers and application deployers who need to provision machines very quickly on-demand to test, patch and validate a given configuration for CERN’s control system applications. Virtualized infrastructure along with cloud management software enabled our users to request new machines on-demand and release them after their testing was complete.

Physical Infrastructure

Implementation

The hardware we used for our experimentation consists of HP ProLiant 380 G4 machines with 8 GB of memory and 500 GB of disk, connected with Gigabit Ethernet. Five servers were running the VMWare ESXi bare-metal hypervisor to provide virtualization capabilities. We also evaluated the Xen hypervisor with the Eucalyptus cloud, but given our requirement for Windows VMs, we opted for VMWare ESXi. OpenNebula Professional (Pro) was used as the cloud front-end to manage the ESXi nodes and to provide users with an access portal.

Deployment architecture with OpenNebula, VMWare ESXi and OpenStack Glance image service.

A number of deployment configurations were tested and their performance was benchmarked. The configurations we tested are the following:

  • Central storage with front end (arch_1): shared storage and the OpenNebula Pro front end run on two different servers. All VM images reside on the shared storage at all times.
  • Central storage without front end (arch_2): shared storage, exported via the Network File System (NFS), shares the same server with the OpenNebula front end. All VM images reside on the shared storage at all times.
  • Distributed storage, remote copy (arch_3): VM images are copied to each ESXi node at deployment time, using the Secure Shell (SSH) protocol, by the front end’s VMWare transfer driver.
  • Distributed storage, local copy (arch_4): VM images are managed by an image manager service which pre-emptively downloads images to all ESXi nodes. The front end runs on a separate server and sets up VMs using the locally cached images.

Each of the deployment configurations has its advantages and disadvantages. arch_1 and arch_2 use a shared storage model where all VMs are set up on central storage. When a VM request is sent to the front end, it clones an existing template image and sets it up on the central storage. It then communicates the memory/networking configuration to the ESXi server, pointing it to the location of the VM image. The advantage of these two configurations is simplified management of template images, as all virtual machine data is stored on the central server. The disadvantages are that a disk failure on the central storage means all VMs lose their data, and that system performance degrades seriously if the shared storage is not high-performance or lacks high-bandwidth connectivity to the ESXi nodes. Central storage becomes the performance bottleneck for these approaches.

arch_3 and arch_4 try to overcome this shortcoming by using the available disk space on the ESXi servers. The challenge here is how to clone and maintain VM images at run time and how to refresh them when they are updated. arch_3 addresses both challenges by copying the VM image to the target node at request time (using the VMWare transfer script add-on from the OpenNebula Pro software) and removing it from the node when the VM is shut down. For each new request, a new copy of the template image is sent over the network to the target node. Despite its advantages, network bandwidth and the ability of the ESXi nodes to make copies of the template images become the bottleneck. arch_4 is our optimization strategy: we implemented an external image manager service that maintains and synchronizes a local copy of each template image on every ESXi node using OpenStack’s Image and Registry service, called Glance. This approach resolves both the storage and the network bandwidth issues.
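
The image manager service itself is not shown here, but as a rough sketch of the arch_4 idea, pre-seeding a template image onto each ESXi node could look like the following; it assumes the python-glanceclient command line (which may differ from the Glance version used in the study), SSH/SCP access to the ESXi hosts, and hypothetical host names, image name and datastore path.

# Hedged sketch of arch_4-style pre-caching: fetch a template from Glance once,
# then push it to the local datastore of every ESXi node. All names are examples.
IMAGE_ID=$(glance image-list | awk '/ubuntu-template/ {print $2}')
glance image-download "$IMAGE_ID" --file /var/cache/templates/ubuntu-template.vmdk
for node in esxi01 esxi02 esxi03; do
  scp /var/cache/templates/ubuntu-template.vmdk \
      "root@${node}:/vmfs/volumes/datastore1/templates/"
done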

Finally, we empirically tested all architectures to answer the following questions:

  • How quickly can the system deploy a given number of virtual machines?
  • Which storage architecture (shared or distributed) will deliver optimal performance?
  • What will be average wait-time for deploying a virtual machine?

Results

All four architectures were evaluated for four different deployment scenarios. Each scenario was run three times, and the averaged results are presented in this section. Any computing infrastructure used by multiple users goes through cycles of demand, which reduces the supply of available resources and the infrastructure’s ability to deliver optimal service quality.

We were particularly interested in the following deployment scenarios, in which 10 virtual machines (10 GB each) were deployed:

  • Single Burst (SB): All virtual machines were sent in burst mode to one server only. This is the most resource-intensive request.
  • Multi Burst (MB): All virtual machines were sent in burst mode to multiple servers.
  • Single Interval (SI): Virtual machines were sent at 3-minute intervals to one server only.
  • Multi Interval (MI): Virtual machines were sent at 3-minute intervals to multiple servers. This is the least resource-intensive request.

Aggregated deployment times for various architectures

The results show that by integrating locally cached images and managing them with the OpenStack image service, we were able to deploy our VMs in less than 5 minutes. The remote copy technique is very useful when image sizes are small, but as image sizes and the number of VM requests grow, it places additional load on the network and increases the time to deploy a VM.

Conclusion

The results have also shown that distributed storage using locally cached images, managed by a centralized cloud platform (in our study, OpenNebula Pro), is a practical option for setting up local clouds where users can set up their virtual machines on demand within 15 minutes (from request to machine boot-up) while keeping the cost of the underlying infrastructure low.

OneVBox: New VirtualBox driver for OpenNebula

This new contribution to the OpenNebula Ecosystem expands OpenNebula by enabling the use of the well-known hypervisor VirtualBox to create and manage virtual machines.

OneVBox supports the upcoming OpenNebula 3.0 (currently in beta) and VirtualBox 4.0. It is composed of several scripts, mostly written in Ruby, which interpret the XML virtual machine descriptions provided by OpenNebula and perform the necessary actions on the VirtualBox node.

OneVBox can deploy, but also save, restore and migrate VirtualBox VMs from one physical node to another.
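
OneVBox’s own Ruby scripts are not reproduced here, but on the VirtualBox side these operations ultimately go through VBoxManage; the sketch below shows the kind of calls involved (the VM name and settings are hypothetical, and the driver’s actual invocations may differ).

# Hedged sketch of the VBoxManage operations a driver like OneVBox wraps.
# 'one-42' and its settings are examples only.
VBoxManage createvm --name one-42 --register
VBoxManage modifyvm one-42 --memory 512 --nic1 bridged --bridgeadapter1 eth0
VBoxManage startvm one-42 --type headless      # deploy
VBoxManage controlvm one-42 savestate          # save
VBoxManage startvm one-42 --type headless      # restore from the saved state
VBoxManage controlvm one-42 poweroff           # cancel/shutdown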

Using the new OneVBox driver is very easy and can be done in a few steps:

  1. Download and install the driver. Run from the driver folder:
    user@frontend $> ./install.sh

    Make sure that you have permissions to write in the OpenNebula folders. $ONE_LOCATION can be used to define the self-contained install path; otherwise it will be installed in system-wide mode.

  2. Enable the plugin. Put this in the oned.conf file and start OpenNebula:
    IM_MAD = [
    name = "im_vbox",
    executable = "one_im_ssh",
    arguments = "-r 0 -t 15 vbox" ]

    VM_MAD = [
    name = "vmm_vbox",
    executable = "one_vmm_exec",
    arguments = "vbox",
    default = "vmm_exec/vmm_exec_vbox.conf",
    type = "xml" ]

  3. Add a VirtualBox host. For example:
    oneadmin@frontend $> onehost create hostname im_vbox vmm_vbox tm_ssh

    OneVBox also includes an OpenNebula Sunstone plugin that will enable adding VirtualBox hosts and creating VirtualBox VM templates from the web interface. In order to enable it, just add the following lines to etc/sunstone-plugins.yaml:

    - user-plugins/vbox-plugin.js:
    :group:
    :ALL: true
    :user:

    (Tip: when copying and pasting, avoid using tabs in YAML files; they are not supported.)

For more information, you can visit the OpenNebula Ecosystem page for OneVBox. If you have questions or problems, please let us know on the Ecosystem mailing list or open an issue in the OneVBox github tracker.

OCCI 1.1 for OpenNebula

A recommendation for version 1.1 of the Open Cloud Computing Interface (OCCI) was recently released by the Open Grid Forum (OGF) (see OGF183 and OGF184). To add OCCI 1.1 support for OpenNebula, we created the Ecosystem project “OCCI for OpenNebula”. The goal of the project is to develop a complete, robust and interoperable implementation of OCCI 1.1 for OpenNebula.

Although the project is still in an early stage, today we released a first version that supports creating and deleting Virtual Networks, Images and Machines. Information on installation and configuration of the OCCI 1.1 extension can be found in the Wiki of the project.
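
As a rough illustration of what such an OCCI 1.1 interaction looks like, the sketch below creates and then deletes a compute resource using the text/occi rendering from the OGF documents; the endpoint, port, credentials and resource ID are hypothetical, and the exact URLs exposed by this implementation may differ (see the project wiki).

# Hedged sketch of an OCCI 1.1 request using the text/occi rendering.
# Endpoint, port, credentials and the resource ID are examples only.
curl -X POST http://localhost:3000/compute/ \
     -u oneadmin:password \
     -H 'Content-Type: text/occi' \
     -H 'Category: compute; scheme="http://schemas.ogf.org/occi/infrastructure#"; class="kind"' \
     -H 'X-OCCI-Attribute: occi.compute.cores=1' \
     -H 'X-OCCI-Attribute: occi.compute.memory=0.5'

# Deleting a resource is a plain DELETE on its URL.
curl -X DELETE http://localhost:3000/compute/RESOURCE_ID -u oneadmin:password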

Florian Feldhaus, Piotr Kasprzak – TU Dortmund

Setting up High Availability in OpenNebula with LVM

In this post, I will explain how to install OpenNebula on two servers in a fully redundant environment. This is the English translation of an article in Italian on my blog.

The idea is to have two Cloud Controllers in High Availability (HA) active/passive mode using Pacemaker/Heartbeat. These nodes will also provide storage by exporting a DRBD partition via ATA-over-Ethernet; the VM disks will be created as LVM logical volumes on this partition. This solution, besides being totally redundant, provides high-speed storage because we deploy the VM partitions via LVM snapshots instead of using files on an NFS filesystem.

Nonetheless, we will still use NFS to export the /srv/cloud directory with OpenNebula data.

System Configuration

As a reference, this is the configuration of our own servers. Your servers do not have to be exactly the same; we will simply be using these two servers to explain certain aspects of the configuration.

First Server:

  • Linux Ubuntu 64-bit server 10.10
  • Cards eth0 and eth1 configured in bonding with IP 172.17.0.251 (SAN network)
  • Card eth2 with IP 172.16.0.251 (LAN)
  • 1 TB internal HD partitioned as follows:
    • sda1: 40 GB mounted on /
    • sda2: 8 GB swap
    • sda3: 1 GB for metadata
    • sda5: 40 GB for /srv/cloud/one
    • sda6: 850 GB datastore

Second Server:

  • Linux Ubuntu 64-bit server 10.10
  • Cards eth0 and eth1 configured in bonding with IP 172.17.0.252 (SAN network)
  • Card eth2 with IP 172.16.0.252 (LAN)
  • 1 TB internal HD partitioned as follows:
    • sda1: 40 GB mounted on /
    • sda2: 8 GB swap
    • sda3: 1 GB for metadata
    • sda5: 40 GB for /srv/cloud/one
    • sda6: 850 GB datastore

Installing the base system

Install Ubuntu Server 64-bit 10.10 on the two servers, enabling the OpenSSH server during installation. In our case, each server is equipped with two 1 TB SATA disks in a hardware mirror, on which we create a 40 GB partition (sda1) for the root filesystem, an 8 GB partition (sda2) for swap, a third (sda3) of 1 GB for metadata, a fourth (sda5) of 40 GB for the directory /srv/cloud/one replicated by DRBD, and a fifth (sda6) with the remaining space (approximately 850 GB) that will be used by DRBD for the export of VM filesystems.

Each server has a total of three network cards: two (eth0, eth1) are configured in bonding to manage data replication and to communicate with the compute nodes on the cluster network (SAN), 172.17.0.0/24, and a third (eth2) is used for access from outside the cluster on the LAN, 172.16.0.0/24.

Unless otherwise specified, these instructions are specific to the above two hosts, but should work on your own system with minor modifications.

Network Configuration

First we modify the hosts file:

/etc/hosts
172.16.0.250 cloud-cc.lan.local cloud-cc
172.16.0.251 cloud-cc01.lan.local
172.16.0.252 cloud-cc02.lan.local
172.17.0.1 cloud-01.san.local
172.17.0.2 cloud-02.san.local
172.17.0.3 cloud-03.san.local
172.17.0.250 cloud-cc.san.local
172.17.0.251 cloud-cc01.san.local cloud-cc01
172.17.0.252 cloud-cc02.san.local cloud-cc02

Next, we proceed to configure the system. First, configure the bonding interface by installing the required packages:

apt-get install ethtool ifenslave

Then we load the module at startup with the correct parameters by creating the file /etc/modprobe.d/bonding.conf:

/etc/modprobe.d/bonding.conf
alias bond0 bonding
options bonding mode=0 miimon=100 downdelay=200 updelay=200

And configure the network interfaces:

/etc/network/interfaces
auto bond0
iface bond0 inet static
bond_miimon  100
bond_mode balance-rr
address  172.17.0.251 # 172.17.0.252 on server 2
netmask  255.255.255.0
up /sbin/ifenslave bond0 eth0 eth1
down /sbin/ifenslave -d bond0 eth0 eth1

auto eth2
iface eth2 inet static
address  172.16.0.251 # 172.16.0.252 on server 2
netmask  255.255.255.0

Configuring MySQL

I prefer to configure MySQL circular replication rather than managing the service launch through Heartbeat, because MySQL starts very quickly; with it active on both servers, we save a few seconds during the switchover in case of a fault.

First we install MySQL:

apt-get install mysql-server libmysqlclient16-dev libmysqlclient16

and create the database for OpenNebula:

mysql -p
create database opennebula;
create user oneadmin identified by 'oneadmin';
grant all on opennebula.* to 'oneadmin'@'%';
exit;

Then we configure active/active replica on server 1:

/etc/mysql/conf.d/replica.cnf @ Server 1
[mysqld]
bind-address			= 0.0.0.0
server-id                       = 10
auto_increment_increment        = 10
auto_increment_offset           = 1
master-host                     = server2.dominio.local
master-user                     = replicauser
master-password                 = replicapass
log_bin				= /var/log/mysql/mysql-bin.log
binlog_ignore_db		= mysql

And on server 2:

/etc/mysql/conf.d/replica.cnf @ server 2
[mysqld]
bind-address			= 0.0.0.0
server-id                       = 20
auto_increment_increment        = 10
auto_increment_offset           = 2
master-host                     = server1.dominio.local
master-user                     = replicauser
master-password                 = replicapass
log_bin				= /var/log/mysql/mysql-bin.log
binlog_ignore_db		= mysql

Finally, on both servers, restart MySQL and create the replica user:

create user 'replicauser'@'%.san.local' identified by 'replicapass';
grant replication slave on *.* to 'replicauser'@'%.dominio.local';
start slave;
show slave status\G;

DRBD Configuration

Now it is DRBD’s turn, configured in standard active/passive mode. First, install the needed packages:

apt-get install drbd8-utils
modprobe drbd

So let’s edit the configuration file:

/etc/drbd.d/global_common.conf
global {
usage-count yes;
# minor-count dialog-refresh disable-ip-verification
}

common {
protocol C;

handlers {
pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
# fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
# split-brain "/usr/lib/drbd/notify-split-brain.sh root";
# out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
# before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
# after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
}

startup {
# wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
wfc-timeout 120; ## 2 min
degr-wfc-timeout 120; ## 2 minutes.
}

disk {
# on-io-error fencing use-bmbv no-disk-barrier no-disk-flushes
# no-disk-drain no-md-flushes max-bio-bvecs
on-io-error detach;
}

net {
# sndbuf-size rcvbuf-size timeout connect-int ping-int ping-timeout max-buffers
# max-epoch-size ko-count allow-two-primaries cram-hmac-alg shared-secret
# after-sb-0pri after-sb-1pri after-sb-2pri data-integrity-alg no-tcp-cork
# allow-two-primaries;
# after-sb-0pri discard-zero-changes;
# after-sb-1pri discard-secondary;

timeout 60;
connect-int 10;
ping-int 10;
max-buffers 2048;
max-epoch-size 2048;
}

syncer {
# rate after al-extents use-rle cpu-mask verify-alg csums-alg
rate 500M;
}
}

And let’s create one-disk definition:

/etc/drbd.d/one-disk.res 
resource one-disk {
    on cloud-cc01 {
	address 172.17.0.251:7791;
	device /dev/drbd1;
	disk /dev/sda5;
	meta-disk /dev/sda3[0];
    }
    on cloud-cc02 {
	address 172.17.0.252:7791;
	device /dev/drbd1;
	disk /dev/sda5;
	meta-disk /dev/sda3[0];
    }
}

and data-disk:

/etc/drbd.d/data-disk.res 
resource data-disk {
    on cloud-cc01 {
	address 172.17.0.251:7792;
	device /dev/drbd2;
	disk /dev/sda6;
	meta-disk /dev/sda3[1];
    }
    on cloud-cc02 {
	address 172.17.0.252:7792;
	device /dev/drbd2;
	disk /dev/sda6;
	meta-disk /dev/sda3[1];
    }
}

Now, on both nodes, we create the DRBD metadata:

drbdadm create-md one-disk
drbdadm create-md data-disk
/etc/init.d/drbd reload

Finally, only on server 1, activate the disks:

drbdadm -- --overwrite-data-of-peer primary one-disk
drbdadm -- --overwrite-data-of-peer primary data-disk

Exporting the disks

As already mentioned, the two DRBD partitions will be made visible over the network, although in different ways: one-disk will be exported through NFS, while data-disk will be exported via ATA-over-Ethernet and will present its LVM partitions to the hypervisors.

Install the packages:

apt-get install vblade nfs-kernel-server nfs-common portmap

We’ll disable automatic NFS and AoE startup because we will handle them via Heartbeat:

update-rc.d nfs-kernel-server disable
update-rc.d vblade disable
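
Before handing the AoE export over to Heartbeat, it can be useful to verify it by hand. A minimal sketch, assuming data-disk is currently primary on this node and the hypervisors have the aoetools package installed (the shelf/slot numbers match the Pacemaker resource defined later):

# On the active storage node: export the DRBD device as AoE shelf 0, slot 0 over bond0.
vblade 0 0 bond0 /dev/drbd/by-res/data-disk
# On a hypervisor node: discover the export; it appears as /dev/etherd/e0.0.
modprobe aoe
aoe-discover
ls /dev/etherd/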

Then we create the export for the OpenNebula directory:

/etc/exports
/srv/cloud/one          172.16.0.0/24(rw,fsid=0,insecure,no_subtree_check,async)

and we create the necessary directory:

mkdir -p /srv/cloud/one

We also have to configure the idmapd daemon to correctly propagate users and permissions over the network.

/etc/idmapd.conf
[General]

Verbosity = 0
Pipefs-Directory = /var/lib/nfs/rpc_pipefs
Domain = lan.local # Modify this

[Mapping]

Nobody-User = nobody
Nobody-Group = nobody

Finally, we configure the default NFS settings:

/etc/default/nfs-kernel-server
NEED_SVCGSSD=no # no is default

and

/etc/default/nfs-common
NEED_IDMAPD=yes
NEED_GSSD=no # no is default

Fault Tolerant daemon configuration

There are two packages that can handle highly available services on Linux: Corosync and Heartbeat. Personally, I prefer Heartbeat and provide instructions for it, but since most of the configuration is done through Pacemaker, you are perfectly free to opt for Corosync instead.

First install the needed packages:

apt-get install heartbeat pacemaker

and configure the Heartbeat daemon:

/etc/ha.d/ha.cf
autojoin none
bcast bond0
warntime 3
deadtime 6
initdead 60
keepalive 1
node cloud-cc01
node cloud-cc02
crm respawn

Only on the first server, we create the authkeys file and copy it to the second server:

( echo -ne "auth 1\n1 sha1 "; \
  dd if=/dev/urandom bs=512 count=1 | openssl md5 ) \
  > /etc/ha.d/authkeys
chmod 0600 /etc/ha.d/authkeys
scp /etc/ha.d/authkeys cloud-cc02:/etc/ha.d/
ssh cloud-cc02 chmod 0600 /etc/ha.d/authkeys
/etc/init.d/heartbeat restart
ssh cloud-cc02 /etc/init.d/heartbeat restart

After a minute or two, heartbeat will be online:

crm_mon -1 | grep Online
Online: [ cloud-cc01 cloud-cc02 ]

Now we’ll configure cluster services via pacemaker.
Setting default options:

crm configure
property no-quorum-policy=ignore
property stonith-enabled=false
property default-resource-stickiness=1000
commit
bye

The two shared IPs, 172.16.0.250 and 172.17.0.250:

crm configure
primitive lan_ip IPaddr params ip=172.16.0.250 cidr_netmask="255.255.255.0" nic="eth2" op monitor interval="40s" timeout="20s"
primitive san_ip IPaddr params ip=172.17.0.250 cidr_netmask="255.255.255.0" nic="bond0" op monitor interval="40s" timeout="20s"
commit
bye

The NFS export:

crm configure
primitive drbd_one ocf:linbit:drbd params drbd_resource="one-disk" op monitor interval="40s" timeout="20s"
ms ms_drbd_one drbd_one meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
commit
bye

The one-disk mount:

crm configure
primitive fs_one ocf:heartbeat:Filesystem params device="/dev/drbd/by-res/one-disk" directory="/srv/cloud/one" fstype="ext4"
commit
bye

The AoE export:

crm configure
primitive drbd_data ocf:linbit:drbd params drbd_resource="data-disk"  op monitor interval="40s" timeout="20s"
ms ms_drbd_data drbd_data meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
commit
bye

The data-disk mount:

crm configure
primitive aoe_data ocf:heartbeat:AoEtarget params device="/dev/drbd/by-res/data-disk" nic="bond0" shelf="0" slot="0" op monitor interval="40s" timeout="20s"
commit
bye

Now we have to configure the correct order in which to start the services:

crm configure
group ha_group san_ip lan_ip fs_one nfs_one aoe_data
colocation ha_col inf: ha_group ms_drbd_one:Master ms_drbd_data:Master
order ha_after_drbd inf: ms_drbd_one:promote ms_drbd_data:promote ha_group:start
commit
bye

We will modify this configuration later to add OpenNebula and lighttpd startup.

LVM Configuration

LVM2 will allow us to create partitions for the virtual machines and deploy them via snapshots.

Install the package on both machines.

apt-get install lvm2

We have to modify the filter configuration so that LVM scans only the DRBD disks.

/etc/lvm/lvm.conf
...
filter = [ "a|drbd.*|", "r|.*|" ]
...
write_cache_state = 0

ATTENTION: Ubuntu uses an initramfs to boot the system, so we also have to update the lvm.conf file inside the initramfs (for example by running update-initramfs -u after editing it).

Now we remove the cache:

rm /etc/lvm/cache/.cache

Only on server 1, we have to create the LVM physical volume and volume group:

pvcreate /dev/drbd/by-res/data-disk
vgcreate one-data /dev/drbd2
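
To give an idea of how OpenNebula’s LVM transfer manager (tm_lvm, configured later) will use this volume group, here is a hedged sketch of the underlying LVM operations: a base logical volume holding a template image and a copy-on-write snapshot handed to a VM. The volume names and sizes are illustrative, not the driver’s exact naming scheme.

# Hedged sketch: create a base LV for a template image and deploy a VM disk as a snapshot.
# Names and sizes are examples only.
lvcreate -L 10G -n lv-template-ubuntu one-data
dd if=/srv/cloud/one/var/images/ubuntu.img of=/dev/one-data/lv-template-ubuntu bs=1M
# Copy-on-write snapshot used as the VM disk; only changed blocks consume space.
lvcreate -s -L 2G -n lv-one-10-disk-0 /dev/one-data/lv-template-ubuntu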

Install and configure OpenNebula

We are almost done. Now we download and install OpenNebula 2.2 from source.

First, we have to install the prerequisites:

apt-get install libsqlite3-dev libxmlrpc-c3-dev scons g++ ruby libopenssl-ruby libssl-dev ruby-dev make rake rubygems libxml-parser-ruby1.8 libxslt1-dev libxml2-dev genisoimage  libsqlite3-ruby libsqlite3-ruby1.8 rails thin
gem install nokogiri
gem install json
gem install sinatra
gem install rack
gem install thin
cd /usr/bin
ln -s rackup1.8 rackup

Then we have to create the OpenNebula user and group:

groupadd cloud
useradd -d /srv/cloud/one  -s /bin/bash -g cloud -m oneadmin
chown -R oneadmin:cloud /srv/cloud/
chmod 775 /srv
id oneadmin # we have to use this id also on cluster node for oneadmin/cloud
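
The oneadmin user must also exist on the cluster nodes with exactly the same numeric UID and GID as on the front end. A minimal sketch, assuming the id command above reported UID 1001 and GID 1001 (replace them with your actual values):

# On each cluster node: recreate cloud/oneadmin with the same numeric IDs as the front end.
# 1001/1001 are example values; use the output of 'id oneadmin' from the front end.
groupadd -g 1001 cloud
useradd -u 1001 -g cloud -d /srv/cloud/one -s /bin/bash oneadmin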

Now we switch to the unprivileged user to create an SSH key pair for cluster communications:

su - oneadmin
ssh-keygen # use default
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 640 ~/.ssh/authorized_keys
mkdir  ~/.one

We create a .profile file with default variables:

~/.profile
export ONE_AUTH='/srv/cloud/one/.one/one_auth'
export ONE_LOCATION='/srv/cloud/one'
export ONE_XMLRPC='http://localhost:2633/RPC2'
export PATH=$PATH':/srv/cloud/one/bin'

Now we have to create the one_auth file to set up a default user inside OpenNebula (used, for example, by the API or Sunstone):

~/.one/one_auth
oneadmin:password

And load the default variables before compiling:

source .profile

Now download and install OpenNebula:

cd
wget http://dev.opennebula.org/attachments/download/339/opennebula-2.2.tar.gz
tar zxvf opennebula-2.2.tar.gz
cd opennebula-2.2
scons -j2 mysql=yes
./install.sh -d /srv/cloud/one

About the configuration: this is my oned.conf file. I use the Xen hypervisor, but you can also use KVM.

/srv/cloud/one/etc/oned.conf
HOST_MONITORING_INTERVAL = 60

VM_POLLING_INTERVAL      = 60

VM_DIR=/srv/cloud/one/var

SCRIPTS_REMOTE_DIR=/var/tmp/one

PORT=2633

DB = [ backend = "mysql",
       server  = "localhost",
       port    = 0,
       user    = "oneadmin",
       passwd  = "oneadmin",
       db_name = "opennebula" ]

VNC_BASE_PORT = 5900

DEBUG_LEVEL=3

NETWORK_SIZE = 254

MAC_PREFIX   = "02:ab"

IMAGE_REPOSITORY_PATH = /srv/cloud/one/var/images
DEFAULT_IMAGE_TYPE    = "OS"
DEFAULT_DEVICE_PREFIX = "sd"

IM_MAD = [
    name       = "im_xen",
    executable = "one_im_ssh",
    arguments  = "xen" ]

VM_MAD = [
    name       = "vmm_xen",
    executable = "one_vmm_ssh",
    arguments  = "xen",
    default    = "vmm_ssh/vmm_ssh_xen.conf",
    type       = "xen" ]

TM_MAD = [
    name       = "tm_lvm",
    executable = "one_tm",
    arguments  = "tm_lvm/tm_lvm.conf" ]

HM_MAD = [
    executable = "one_hm" ]

VM_HOOK = [
    name      = "image",
    on        = "DONE",
    command   = "image.rb",
    arguments = "$VMID" ]

HOST_HOOK = [
    name      = "error",
    on        = "ERROR",
    command   = "host_error.rb",
    arguments = "$HID -r n",
    remote    = "no" ]

VM_HOOK = [
   name      = "on_failure_resubmit",
   on        = "FAILED",
   command   = "/usr/bin/env onevm resubmit",
   arguments = "$VMID" ]

The only important thing is to modify /srv/cloud/one/etc/tm_lvm/tm_lvm.rc, setting the default VG:

/srv/cloud/one/etc/tm_lvm/tm_lvm.rc
...
VG_NAME=one-data
...

Now copy the init.d script from the source tree to /etc/init.d, but do not set it to start at boot.

I have modified the default script to also start Sunstone:

/etc/init.d/one
#! /bin/sh
### BEGIN INIT INFO
# Provides:          opennebula
# Required-Start:    $remote_fs
# Required-Stop:     $remote_fs
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: OpenNebula init script
# Description:       OpenNebula cloud initialisation script
### END INIT INFO

# Author: Soren Hansen - modified by Alberto Zuin

PATH=/sbin:/usr/sbin:/bin:/usr/bin:/srv/cloud/one
DESC="OpenNebula cloud"
NAME=one
SUNSTONE=/srv/cloud/one/bin/sunstone-server
DAEMON=/srv/cloud/one/bin/$NAME
DAEMON_ARGS=""
PIDFILE=/var/run/$NAME.pid
SCRIPTNAME=/etc/init.d/$NAME

# Exit if the package is not installed
[ -x "$DAEMON" ] || exit 0

# Load the VERBOSE setting and other rcS variables
. /lib/init/vars.sh

# Define LSB log_* functions.
# Depend on lsb-base (>= 3.0-6) to ensure that this file is present.
. /lib/lsb/init-functions

#
# Function that starts the daemon/service
#
do_start()
{
mkdir -p /var/run/one /var/lock/one
chown oneadmin /var/run/one /var/lock/one
su - oneadmin -s /bin/sh -c "$DAEMON start"
su - oneadmin -s /bin/sh -c "$SUNSTONE start"
}

#
# Function that stops the daemon/service
#
do_stop()
{
su - oneadmin -s /bin/sh -c "$SUNSTONE stop"
su - oneadmin -s /bin/sh -c "$DAEMON stop"
}

case "$1" in
start)
[ "$VERBOSE" != no ] && log_daemon_msg "Starting $DESC" "$NAME"
do_start
case "$?" in
0|1) [ "$VERBOSE" != no ] && log_end_msg 0 ;;
2) [ "$VERBOSE" != no ] && log_end_msg 1 ;;
esac
;;
stop)
[ "$VERBOSE" != no ] && log_daemon_msg "Stopping $DESC" "$NAME"
do_stop
case "$?" in
0|1) [ "$VERBOSE" != no ] && log_end_msg 0 ;;
2) [ "$VERBOSE" != no ] && log_end_msg 1 ;;
esac
;;
restart|force-reload)
#
# If the "reload" option is implemented then remove the
# 'force-reload' alias
#
log_daemon_msg "Restarting $DESC" "$NAME"
do_stop
case "$?" in
0|1)
do_start
case "$?" in
0) log_end_msg 0 ;;
1) log_end_msg 1 ;; # Old process is still running
*) log_end_msg 1 ;; # Failed to start
esac
;;
*)
# Failed to stop
log_end_msg 1
;;
esac
;;
*)
echo "Usage: $SCRIPTNAME {start|stop|restart|force-reload}" >&2
exit 3
;;
esac

:

and set it with execute permissions:

chmod 755 /etc/init.d/one

Configuring the HTTPS proxy for Sunstone

Sunstone is the web interface for cloud administration, for those who do not want to use the command line. It listens on port 4567 and is not encrypted, so we will use lighttpd to proxy requests over an HTTPS-encrypted connection.

First install the daemon:

apt-get install ssl-cert lighttpd

Then generate certificates:

/usr/sbin/make-ssl-cert generate-default-snakeoil
cat /etc/ssl/private/ssl-cert-snakeoil.key /etc/ssl/certs/ssl-cert-snakeoil.pem > /etc/lighttpd/server.pem

and create symlinks to enable the SSL and proxy modules:

ln -s /etc/lighttpd/conf-available/10-ssl.conf /etc/lighttpd/conf-enabled/
ln -s /etc/lighttpd/conf-available/10-proxy.conf /etc/lighttpd/conf-enabled/

And modify the lighttpd setup to enable proxying to Sunstone:

/etc/lighttpd/conf-available/10-proxy.conf
proxy.server               = ( "" =>
                                ("" =>
                		(
                                 "host" => "127.0.0.1",
                                 "port" => 4567
                                )
                                )
                            )

Starting lighttpd and OpenNebula with Heartbeat

Now we add the startup scripts to Heartbeat. First, put both servers in standby mode:

crm node
standby cloud-cc01
standby cloud-cc02
bye

Then we can change the configuration:

crm configure
primitive OpenNebula lsb:one
primitive lighttpd lsb:lighttpd
delete ha_group
group ha_group san_ip lan_ip fs_one nfs_one aoe_data OpenNebula lighttpd
colocation ha_col inf: ha_group ms_drbd_one:Master ms_drbd_data:Master
order ha_after_drbd inf: ms_drbd_one:promote ms_drbd_data:promote ha_group:start
commit
bye

And bring the cluster online again:

crm node
online cloud-cc01
online cloud-cc02
bye

That’s all folks!
Thanks,
Alberto Zuin – http://www.anzs.it

OpenNebula shared storage with MooseFS

When running many VMs with persistent images, there is a need for shared storage behind the OpenNebula hosts, in order to recover faster in case of host failure. However, SANs are expensive, and an NFS server or NAS cannot provide the necessary performance or fault tolerance.

A distributed, fault-tolerant network filesystem fits naturally into this gap. It provides shared storage without the need for dedicated storage hardware, and fault tolerance by replicating your data across different nodes.

I work at LiberSoft, where we evaluated two different open-source distributed filesystems, MooseFS and GlusterFS. A third choice could be Ceph, which is currently under heavy development and probably not yet production-ready, but it would certainly be a good alternative in the near future.

Our choice fell on MooseFS because of its great expandability (you can add as many disks as you want, of any size you prefer) and its web monitor, where you can easily check the status of your shared storage (replication status or disk errors). So we published in the Ecosystem section a new transfer manager and some basic instructions to get it working together with OpenNebula.
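
The published transfer manager and its instructions cover the details; as a rough illustration only, making a MooseFS share available to an OpenNebula host boils down to mounting it under the VM directory. A minimal sketch, assuming a MooseFS master reachable as mfsmaster and the default OpenNebula paths (the client package name may vary by distribution):

# Hedged sketch: mount the MooseFS filesystem where OpenNebula expects VM disks.
# 'mfsmaster', the package name and the mount point are examples; adjust to your deployment.
apt-get install mfs-client
mkdir -p /srv/cloud/one/var
mfsmount /srv/cloud/one/var -H mfsmaster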

We had promising results during the test deployment of 4 nodes (Gateway with 2x Xeon X3450, 12GB RAM, 2x2TB SATA2 disks) for a private cloud at the National Central Library of Florence (Italy), which will grow as most Windows and Linux servers move to the cloud over the next few months.

The requirements for this project were to use ordinary, affordable hardware and open-source software to avoid any possible vendor lock-in, with the aim of lowering energy consumption and hardware maintenance costs.

OpenNebula in the EU Initiative to Integrate Cloud with Grid

Researchers from a collaboration of six European organisations have attracted funding worth €2.3 million to develop a new Internet-based software project called StratusLab. The two-year project, headed up by Project Coordinator Dr Charles Loomis from CNRS, was launched in Paris on the 14th of June 2010. It aims to enhance distributed computing infrastructures, such as the European Grid Infrastructure (EGI), that allow research and higher education institutes from around the world to pool computing resources.

Funded through the European Union Seventh Framework Programme (FP7), the two-year project aims to successfully integrate ‘cloud computing’ technologies into ‘grid’ infrastructures. Grids link computers and data that are scattered across the globe to work together for common goals, whilst cloud computing makes software platforms or virtual servers available as a service over the Internet, usually on a commercial basis, and provides a way for organisations to access computing capacity without investing directly in new infrastructure. Behind cloud services are data centres that typically house large numbers of processors and vast data storage systems. Linking grid and cloud technologies will result in major benefits for European academic research and is part of the European Commission strategy to develop European computing infrastructures.

StratusLab will integrate, distribute and maintain a sustainable open-source cloud distribution to bring cloud computing to existing and new grid sites. The StratusLab toolkit will be composed of existing cutting-edge open-source software and the innovative service and cloud management technologies developed in the project. The StratusLab toolkit will integrate OpenNebula, the leading open-source toolkit for cloud computing. OpenNebula is a cloud management tool that is widely used in several grid and HPC sites.

Speaking about the project, Project Coordinator Dr Charles Loomis said: “Computer grids are used by thousands of researchers in many scientific fields. For example, the data from the Large Hadron Collider’s experiments, the world’s largest and highest-energy particle accelerator situated at CERN in Switzerland, are distributed via an international grid infrastructure to be processed at institutes around Europe and the world. The StratusLab toolkit will make the grid easier to manage and will allow grids to tap into commercial cloud services to meet peak demands. Later it will allow organisations that already provide a grid service to offer a cloud service to academic users, whilst retaining the many benefits of the grid approach.”

The StratusLab project will bring several benefits to the distributed computing infrastructure ecosystem including simplified management, added flexibility, increased maintainability, quality, energy efficiency and resilience of computing sites. It will benefit a wide variety of users from scientists, who can use the systems to run scientific analyses, to system administrators and hardware technicians, who are responsible for running grid services and maintaining the hardware and infrastructure at various resource centres.

The StratusLab project brings together six organisations, all key players with recognised leadership, proven expertise, experience and skills in grid and cloud computing. This collaboration presents a balanced combination of academic, research and industrial institutes with complementary capabilities. The participating organisations include the Centre National de la Recherche Scientifique (CNRS), France; the DSA-Research Group at Universidad Complutense de Madrid, Spain; the Greek Research and Technology Network S.A., Greece; SixSq Sárl, Switzerland; Telefonica Investigacion y Desarrollo, Spain, and Trinity College Dublin, Ireland.

About the StratusLab Project

The StratusLab project consists of numerous collaborators from six European research institutions. A website can be accessed via the following address: www.stratuslab.eu. The project is partially funded by the European Commission through the Grant Agreement RI-261552.

About OpenNebula

OpenNebula is the most advanced open-source toolkit for building private, public and hybrid clouds, offering unique features for cloud management and providing the integration capabilities that many enterprise IT shops need for internal cloud. OpenNebula is the result of many years of research and development in efficient and scalable management of virtual machines on large-scale distributed infrastructures. The technology has been designed to address the requirements of business use cases from leading companies in the context of flagship international projects in cloud computing. For more info: http://www.OpenNebula.org

About European Union Framework Programme 7

The Seventh Framework Programme (FP7) bundles all research-related EU initiatives together under a common roof playing a crucial role in reaching the goals of growth, competitiveness and employment. The framework programme runs a number of programmes under the headings Cooperation, Ideas, People and Capacities. All specific programmes work together to promote and encourage the creation of European poles of scientific excellence. More information on FP7 can be obtained from http://cordis.europa.eu/fp7/home_en.html.

OpenNebula Documentation in PDF

C12G Labs is happy to announce that the OpenNebula guides are now available in PDF format from the OpenNebula Ecosystem. The following guides are available:

  • Private Cloud Computing with OpenNebula 1.4
  • Public Cloud Computing with OpenNebula 1.4
  • Hybrid Cloud Computing with OpenNebula 1.4
  • OpenNebula 1.4 Reference Guide

OpenNebula users can benefit from having all the information bundled in well-organized and easily accessible guides, which are very suitable for offline reference and printing.