This newsletter covers the most noteworthy developments and events of the OpenNebula project and its community during the last month, as well as the plans for the upcoming months.

Technology

An important milestone was reached this month with the publication of the Open Cloud Reference Architecture for Basic and Advanced cloud deployments. This Reference Architecture has been created from the collective information and experiences of hundreds of users and cloud client engagements. Besides the main logical components and their interrelationships, the document describes the software products, configurations, and requirements of the infrastructure platforms recommended for a smooth OpenNebula installation.

A maintenance release for Cotton Candy, 4.12.1, was published this month by the OpenNebula team. This release comes with several bug fixes found after the 4.12 release, covering different OpenNebula components such as the scheduler, the Cloud View self-service portal, the Sunstone web interface, the OpenNebula core and several drivers (VM, Auth, Network). Besides these bug fixes, 4.12.1 includes several improvements, like VNC capabilities for VMs imported from vCenter, a logrotate script for OpenNebula logs and, especially, a revised scheduler able to cope with large XML files. OpenNebula instances are now able to manage even more VMs!

April also saw the release of a new stable version of vOneCloud, the open replacement for vCloud to cloudify vSphere infrastructures. vOneCloud 1.4 comes with outstanding new features for vCenter resource management, such as Showback capabilities. The VDC model has been revisited to make resource sharing among different groups easier, and the interfaces have been reworked to smooth the workflow of importing vCenter resources. The main highlight, though, is the addition of multi-VM management capabilities, enabling the management of services, including the ability to set up elasticity rules that automatically increase or decrease the number of nodes composing a service. vOneCloud is zero intrusive; try it out without the need to commit to it!

Community

Technical posts are definitely our sort of thing. And when they come as detailed and well rounded as this amazing piece showing how to install OpenNebula in HA with Ceph and IPoIB on CentOS, all the better. Check it out for an awesome script laying out all the steps needed to reproduce this setup. Another example is this excellent post about securing noVNC connections for Sunstone, with a detailed explanation of how to create your own CA for VNC connections.

In-depth analysis of the OpenNebula technology by third parties is a great way to promote OpenNebula, giving users leeway to choose the IaaS technology that best fits their needs. This blog post by OlinData is a great example.

Our community is always giving back, and that is the spirit of open source. People sharing their work to make other people's lives easier is a great example of the health of the OpenNebula community. This new add-on enabling the integration of OpenNebula and StorPool broadens the integration capabilities of OpenNebula. Also noteworthy is this example of a Node.js boilerplate to interact with OpenNebula. Thanks!

As you may know, OpenNebula is participating in the BEACON project, a flagship European project in federated cloud networking. You can check the profile of OpenNebula Systems in the project blog.

We run on feedback. Seriously, it is never enough. If you are doing an OpenNebula deployment, we want to hear from you! As they say, through thick and thin. Have you just installed a Windows 10 VM using OpenNebula? We want to hear from you!

Outreach

The upcoming third edition of the OpenNebulaConf will be held in Barcelona in October 2015. You are still in time to get a good price deal on tickets. Also, your company may be interested in the sponsorship opportunities for OpenNebulaConf 2015.

We have two Cloud Technology Days planned for the US, in Chicago and Boston, at the end of June. We will publish the details in a few days. If anyone is interested in hosting a TechDay on the East Coast (in particular, we are looking for hosts in New York), drop us a line.

During the following months, members of the OpenNebula team will be speaking at the following events:

If you are interested in receiving OpenNebula training, check the schedule of 2015 public classes at OpenNebula Headquarters. Please contact us if you would like to request training near you.

Remember that you can see slides and resources from past events in our Events page. We have also created a Slideshare account where you can see the slides from some of our recent presentations.

The first OpenNebula Cloud TechDay of the Northeast USA tour will be held in Cambridge, MA, at the Microsoft New England R&D Center, organized by the HPC & GPU Supercomputing Group of Boston and sponsored by Microway.

The event will start on the 29th of June at 9:00 with a hands-on cloud installation and operation workshop, and will continue with presentations from OpenNebula community members and users, and related open-source projects. The page of the TechDay contains all the details about the event.

If you want to actively participate in this event, share your experience with OpenNebula or describe other related cloud open-source projects and tools, send us your talk proposal at events@opennebula.org.

The number of seats is limited to ensure there is plenty of opportunity for everyone to interact. We encourage everyone to register as early as possible.

If you want to organize an OpenNebula TechDay in another city during our Northeast USA tour this is your chance! If you are interested or want to have more information please send an email to contact@opennebula.org.

We hope to see you there! And a big thanks to the HPC & GPU Supercomputing Group of Boston for making the OpenNebula TechDay possible.

When dealing with noVNC connections I've faced some problems as a newbie, so today I'm sharing this post with you in the hope that it helps.

If you're already using SSL to secure Sunstone's access, you could get an error when opening a VNC window: "VNC Connection in progress". It's quite possible that your browser is silently blocking the VNC connection over websockets. The reason? You're using an HTTPS connection with Sunstone, but you're trying to open an unencrypted websocket connection.

VNC_Connection_In_Progress

This is easily solved: just edit the following lines in the # UI Settings section of your /etc/one/sunstone-server.conf configuration file:

:vnc_proxy_support_wss: yes
:vnc_proxy_cert: /etc/one/certs/one-tornasol.crt
:vnc_proxy_key: /etc/one/certs/one-tornasol.key

We've just activated the secure websockets (wss) option and told Sunstone where to find the SSL certificate and the key (if it's not already included in the cert). Now, just restart your Sunstone server.
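
The exact restart command depends on how Sunstone was installed; on a frontend using systemd it would typically be something like:

 systemctl restart opennebula-sunstone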

 

There's another issue with VNC and SSL when using self-signed certificates. When running your own lab or using a development environment, maybe you don't have an SSL certificate signed by a real CA and you opt for self-signed certificates, which are quick and free to use… but this has some drawbacks.

Trying to protect you from security threats, your Internet browser could have problems with secure websockets and self-signed certificates, and messages like "VNC Disconnect timeout" and "VNC Server disconnected (code: 1006)" could show up.

VNC_Disconnected

In my labs I just use the openssl command (available in the openssl package on CentOS/Red Hat and Debian/Ubuntu) to generate my own Certificate Authority certificate and sign the SSL certificates.

First we'll create the /etc/one/certs directory on the Frontend and set the right owner:

mkdir -p /etc/one/certs
chown -R oneadmin:oneadmin /etc/one/certs

We’ll generate an RSA key with 2048 bits for the CA:

openssl genrsa -out /etc/one/certs/oneCA.key 2048

Now we'll produce the CA certificate using the key we've just created, and we'll have to answer some questions to identify our CA (e.g. my CA will be named ArtemIT Labs CA). Note that this CA certificate will be valid for 3650 days, 10 years!

openssl req -x509 -new -nodes -key /etc/one/certs/oneCA.key -days 3650 -out /etc/one/certs/oneCA.pem

You are about to be asked to enter information that will be incorporated into your certificate request.

What you are about to enter is what is called a Distinguished Name or a DN.

There are quite a few fields but you can leave some blank

For some fields there will be a default value,

If you enter '.', the field will be left blank.
----
Country Name (2 letter code) [XX]:ES
State or Province Name (full name) []:Valladolid
Locality Name (eg, city) [Default City]:Valladolid
Organization Name (eg, company) [Default Company Ltd]:ArtemIT Labs
Organizational Unit Name (eg, section) []:
Common Name (eg, your name or your server's hostname) []:ArtemIT Labs CA
Email Address []:

Now, we already have a CA certificate and a key to sign SSL certificates. Time to generate the SSL certificate for WSS connections.

First, we'll create the key for the Frontend, then we'll generate the certificate request (CSR), answering some questions. In this example my Frontend server is called tornasol.artemit.local and I've set no challenge password for the certificate.

openssl genrsa -out /etc/one/certs/one-tornasol.key 2048


openssl req -new -key /etc/one/certs/one-tornasol.key -days 3650 -out /etc/one/certs/one-tornasol.csr

You are about to be asked to enter information that will be incorporated into your certificate request.

What you are about to enter is what is called a Distinguished Name or a DN.

There are quite a few fields but you can leave some blank

For some fields there will be a default value,

If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [XX]:ES
State or Province Name (full name) []:Valladolid
Locality Name (eg, city) [Default City]:Valladolid
Organization Name (eg, company) [Default Company Ltd]:ArtemIT Labs
Organizational Unit Name (eg, section) []:
Common Name (eg, your name or your server's hostname) []:tornasol.artemit.local
Email Address []:
Please enter the following 'extra' attributes to be sent with your certificate request
A challenge password []:
An optional company name []:

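The signing step is not shown explicitly in the post, but to obtain the one-tornasol.crt file referenced in sunstone-server.conf the CSR presumably has to be signed with the CA along these lines (a sketch, using the paths from above):

 openssl x509 -req -in /etc/one/certs/one-tornasol.csr -CA /etc/one/certs/oneCA.pem -CAkey /etc/one/certs/oneCA.key -CAcreateserial -out /etc/one/certs/one-tornasol.crt -days 3650
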
If everything is fine you’ll have the certs and keys under /etc/one/certs.

Now we'll copy the oneCA.pem file to the computers where we'll use a browser to open the Sunstone GUI.

In Firefox we'll import oneCA.pem (the CA certificate file) under Preferences -> Advanced -> Certificates -> Authorities tab, checking all the options as shown in this image. If you're using Chrome under Linux, the process for importing your CA cert is the same.

trust_ca_firefox

If using IE or Chrome under Windows, change the extension from pem to crt, double-click the certificate and add it to the Trusted Root Certification Authorities store. Some warnings will show up, just accept them.

Once your CA certificate is trusted, you can open your encrypted noVNC windows.

Screenshot from 2015-04-25 15:06:08

Free, quick and secure for your lab environment, but remember: don't do this in a production environment!

Cheers!

We want you to know that OpenNebula Systems has just announced the availability of vOneCloud version 1.4.

Several exciting features have been introduced in vOneCloud 1.4. The appliance that helps you turn your vSphere infrastructure into a private cloud now generates daily reports that every user can consult to check their resource consumption, with associated costs defined by the Cloud Administrator. The Virtual Datacenter provisioning model has been revisited to make resource sharing among different groups easier and to simplify configuration. The interfaces have also been improved to smooth the workflow of importing vCenter resources via the vOneCloud web interface, Sunstone. But probably most importantly, vOneCloud 1.4 adds multi-VM management capabilities, enabling the management of sets of interconnected VMs (services), including the ability to set up elasticity rules that automatically increase or decrease the number of nodes composing a service, according to easily programmed rules that take the service demands into account.

Improvements were also made to the Control Panel, a web interface that eases the configuration of vOneCloud services and enables smooth one-click upgrades to newer versions, introducing features to aid in troubleshooting the appliance.

The above features and components add to the already present ability to expose a multi-tenant cloud-like provisioning layer through the use of virtual datacenters, self-service portal, or hybrid cloud computing to connect in-house vCenter infrastructures with public clouds. vOneCloud seamlessly integrates with running vCenter virtualized infrastructures, leveraging advanced features such as vMotion, HA or DRS scheduling provided by the VMware vSphere product family.

vOneCloud is zero intrusive, so try it out without the need to commit to it. If you don't like it, just remove the appliance!

Relevant Links

 

We are excited to announce the release of the first version of the Open Cloud Reference Architecture. The OpenNebula Reference Architecture is a blueprint to guide IT architects, consultants, administrators and field practitioners in the design and deployment of public and private clouds fully based on open-source platforms and technologies. This Reference Architecture has been created from the collective information and experiences from hundreds of users and cloud client engagements. Besides main logical components and interrelationships, this reference documents software products, configurations, and requirements of infrastructure platforms recommended for a smooth OpenNebula installation. Three optional functionalities complete the architecture: high availability, cloud bursting for workload outsourcing, and federation of geographically dispersed data centers.

The document describes the reference architecture for Basic (small to medium-scale) and Advanced (medium to large-scale) OpenNebula Clouds and provides recommended software for the main architectural components, along with the rationale behind the recommendations. Each section also provides information about other open-source infrastructure platforms tested and certified by OpenNebula to work in enterprise environments. To complement these certified components, the OpenNebula add-on catalog can be browsed for other options supported by the community and partners. Moreover, there are other components in the open cloud ecosystem that are not part of the reference architecture but are nonetheless important to consider when designing a cloud, for example Configuration Management and Automation Tools for configuring cloud infrastructure and managing large numbers of devices.

You can download a copy from the Jumpstart Packages page at the OpenNebula Systems web site.

Thank you!

Introduction.

This article explores the process of installing OpenNebula in HA with Ceph as the datastore on three nodes (disks: 6x 240 GB SSD, backend network: IPoIB, OS: CentOS 7), using one additional node for backup.

Scheme of the equipment:

We are using this solution to virtualize our imagery processing servers.

Preparing.

All actions should be performed on all nodes. On kosmo-arch, perform everything except the bridge-utils installation and the FrontEnd network configuration.

yum install bridge-utils

FrontEnd network.

Configure bond0 (mode 0) and run the script below to create the frontend bridge interface for the OpenNebula VMs:

#!/bin/bash
# Turn bond0 into a port of a new bridge "nab1" that the OpenNebula VMs will use
Device=bond0
cd /etc/sysconfig/network-scripts
if [ ! -f ifcfg-nab1 ]; then
  # Back up the original bond0 configuration
  cp -p ifcfg-$Device bu-ifcfg-$Device
  # Rewrite bond0 so it only attaches to the bridge
  echo -e "DEVICE=$Device\nTYPE=Ethernet\nBOOTPROTO=none\nNM_CONTROLLED=no\nONBOOT=yes\nBRIDGE=nab1" > ifcfg-$Device
  grep ^HW bu-ifcfg-$Device >> ifcfg-$Device
  # Create the bridge and move the remaining (IP) settings to it
  echo -e "DEVICE=nab1\nNM_CONTROLLED=no\nONBOOT=yes\nTYPE=bridge" > ifcfg-nab1
  egrep -v "^#|^DEV|^HWA|^TYP|^UUI|^NM_|^ONB" bu-ifcfg-$Device >> ifcfg-nab1
fi

BackEnd network. Configuration of IPoIB:

yum groupinstall -y "Infiniband Support"
yum install opensm

Enable IPoIB and switch InfiniBand to connected mode. This link explains the differences between connected and datagram modes.

 cat /etc/rdma/rdma.conf
# Load IPoIB
IPOIB_LOAD=yes
# Setup connected mode
SET_IPOIB_CM=yes

Start Infiniband services.

systemctl enable rdma opensm
systemctl start rdma opensm
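
Optionally, you can verify that connected mode is active before the lower-level checks below (assuming the IPoIB interface is named ib0; this check is not part of the original post):

 cat /sys/class/net/ib0/mode   # should print "connected"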

Check that it is working:

ibv_devinfo

hca_id: mlx4_0
      transport:                      InfiniBand (0)
      fw_ver:                         2.7.000
      node_guid:                      0025:90ff:ff07:3368
      sys_image_guid:                 0025:90ff:ff07:336b
      vendor_id:                      0x02c9
      vendor_part_id:                 26428
      hw_ver:                         0xB0
      board_id:                       SM_1071000001000
      phys_port_cnt:                  2
              port:   1
                      state:                  PORT_ACTIVE (4)
                      max_mtu:                4096 (5)
                      active_mtu:             4096 (5)
                      sm_lid:                 8
                      port_lid:               4
                      port_lmc:               0x00
                      link_layer:             InfiniBand
              port:   2
                      state:                  PORT_ACTIVE (4)
                      max_mtu:                4096 (5)
                      active_mtu:             4096 (5)
                      sm_lid:                 4
                      port_lid:               9
                      port_lmc:               0x00
                      link_layer:             InfiniBand

and

iblinkinfo
CA: kosmo-virt1 mlx4_0:
    0x002590ffff073385     13    1[  ] ==( 4X          10.0 Gbps Active/  LinkUp)==>       2   10[  ] "Infiniscale-IV Mellanox Technologies" ( )
Switch: 0x0002c90200482d08 Infiniscale-IV Mellanox Technologies:
         2    1[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2    2[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2    1[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2    4[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2    5[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2    6[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2    7[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2    8[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2    9[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2   10[  ] ==( 4X          10.0 Gbps Active/  LinkUp)==>      13    1[  ] "kosmo-virt1 mlx4_0" ( )
         2   11[  ] ==( 4X          10.0 Gbps Active/  LinkUp)==>       4    1[  ] "kosmo-virt2 mlx4_0" ( )
         2   12[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2   13[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2   14[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2   15[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2   16[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2   17[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2   18[  ] ==(                Down/ Polling)==>             [  ] "" ( )
CA: kosmo-virt2 mlx4_0:
    0x002590ffff073369      4    1[  ] ==( 4X          10.0 Gbps Active/  LinkUp)==>       2   11[  ] "Infiniscale-IV Mellanox Technologies" ( )

Set up bond1 (mode 1) over the two IB interfaces and assign the IP 172.19.254.X, where X is the node number. Example below:

 cat /etc/modprobe.d/bonding.conf
 alias bond0 bonding
 alias bond1 bonding
 cat /etc/sysconfig/network-scripts/ifcfg-bond1
 DEVICE=bond1
 TYPE=bonding
 BOOTPROTO=static
 USERCTL=no
 ONBOOT=yes
 IPADDR=172.19.254.x
 NETMASK=255.255.255.0
 BONDING_OPTS="mode=1 miimon=500 primary=ib0"
 MTU=65520

Disable the firewall.
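
The commands are not shown in the post; on CentOS 7 this would typically be:

 systemctl stop firewalld
 systemctl disable firewalld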

Tuning sysctl.

net.core.rmem_max=16777216
net.core.wmem_max=16777216
net.core.rmem_default=16777216
net.core.wmem_default=16777216
net.core.optmem_max=16777216
net.ipv4.tcp_mem=16777216 16777216 16777216
net.ipv4.tcp_rmem=4096 87380 16777216
net.ipv4.tcp_wmem=4096 65536 16777216
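
The post doesn't say where to place these settings; typically they go into /etc/sysctl.conf or a file under /etc/sysctl.d/ (the filename below is an assumption) and are applied with:

 # after saving the lines above to e.g. /etc/sysctl.d/90-net-tuning.conf
 sysctl --system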

Installing Ceph.

Preparation

Configure passwordless SSH access between nodes for the root user. The key should be created on one node and then copied to the other nodes' /root/.ssh/ directory.

ssh-keygen -t dsa (creation of passwordless key)
cd /root/.ssh
cat id_dsa.pub >> authorized_keys
chown root.root authorized_keys
chmod 600 authorized_keys
echo "StrictHostKeyChecking no" > config
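
One straightforward way to distribute the key material to the other nodes (hostnames as used throughout this setup; you will be prompted for passwords since passwordless access is not yet in place):

 for h in kosmo-virt2 kosmo-virt3 kosmo-arch; do
   ssh $h mkdir -p /root/.ssh
   scp /root/.ssh/id_dsa /root/.ssh/id_dsa.pub /root/.ssh/authorized_keys /root/.ssh/config $h:/root/.ssh/
 done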

Disable SELinux on all nodes:

In /etc/selinux/config
SELINUX=disabled

setenforce 0

Add max open files to /etc/security/limits.conf (depends on your requirements) on all nodes

* hard nofile 1000000
* soft nofile 1000000

Setup /etc/hosts on all nodes:

172.19.254.1 kosmo-virt1
172.19.254.2 kosmo-virt2
172.19.254.3 kosmo-virt3  
172.19.254.150 kosmo-arch
192.168.14.42 kosmo-virt1
192.168.14.43 kosmo-virt2
192.168.14.44 kosmo-virt3  
192.168.14.150 kosmo-arch

Installing

Install a kernel >3.15 on all nodes (needed for the CephFS client):

rpm -ivh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm
yum --enablerepo=elrepo-kernel install kernel-ml -y

Set up new kernel for booting.

grep ^menuentry /boot/grub2/grub.cfg 
grub2-set-default 0 # number of our kernel
grub2-editenv list
grub2-mkconfig -o /boot/grub2/grub.cfg

Reboot.

Set up the Ceph repository (on all nodes). Note the quoted heredoc delimiter below, so that $basearch is written literally into the repo file for yum instead of being expanded by the shell:

 cat << 'EOT' > /etc/yum.repos.d/ceph.repo
 [ceph]
 name=Ceph packages for $basearch
 baseurl=http://ceph.com/rpm/el7/$basearch
 enabled=1
 gpgcheck=1
 type=rpm-md
 gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc
 
 [ceph-noarch]
 name=Ceph noarch packages
 baseurl=http://ceph.com/rpm/el7/noarch
 enabled=1
 gpgcheck=1
 type=rpm-md
 gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc
 EOT

Import gpgkey: (on all nodes)

 rpm --import 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc'

Setup ntpd. (on all nodes)

yum install ntp

Edit /etc/ntp.conf and start ntpd (on all nodes):

systemctl enable ntpd
systemctl start ntpd

Install: (on all nodes)

yum install libunwind -y
yum install -y  ceph-common ceph ceph-fuse ceph-deploy

Deploying.

(on kosmo-virt1) 
cd /etc/ceph
ceph-deploy new kosmo-virt1 kosmo-virt2 kosmo-virt3

MON deploying: (on kosmo-virt1)

ceph-deploy  mon create-initial

OSD deploying:

(on kosmo-virt1)

 cd /etc/ceph
 ceph-deploy gatherkeys kosmo-virt1
 ceph-deploy disk zap kosmo-virt1:sdb
 ceph-deploy osd prepare kosmo-virt1:sdb
 ceph-deploy disk zap kosmo-virt1:sdc
 ceph-deploy osd prepare kosmo-virt1:sdc
 ceph-deploy disk zap kosmo-virt1:sdd
 ceph-deploy osd prepare kosmo-virt1:sdd
 ceph-deploy disk zap kosmo-virt1:sde
 ceph-deploy osd prepare kosmo-virt1:sde
 ceph-deploy disk zap kosmo-virt1:sdf
 ceph-deploy osd prepare kosmo-virt1:sdf
 ceph-deploy disk zap kosmo-virt1:sdg
 ceph-deploy osd prepare kosmo-virt1:sdg

(on kosmo-virt2)

 cd /etc/ceph
 ceph-deploy gatherkeys kosmo-virt2
 ceph-deploy disk zap kosmo-virt2:sdb
 ceph-deploy osd prepare kosmo-virt2:sdb
 ceph-deploy disk zap kosmo-virt2:sdc
 ceph-deploy osd prepare kosmo-virt2:sdc
 ceph-deploy disk zap kosmo-virt2:sdd
 ceph-deploy osd prepare kosmo-virt2:sdd
 ceph-deploy disk zap kosmo-virt2:sde
 ceph-deploy osd prepare kosmo-virt2:sde
 ceph-deploy disk zap kosmo-virt2:sdf
 ceph-deploy osd prepare kosmo-virt2:sdf
 ceph-deploy disk zap kosmo-virt2:sdg
 ceph-deploy osd prepare kosmo-virt2:sdg

(on kosmo-virt3)

 cd /etc/ceph
 ceph-deploy gatherkeys kosmo-virt3
 ceph-deploy disk zap kosmo-virt3:sdb
 ceph-deploy osd prepare kosmo-virt3:sdb
 ceph-deploy disk zap kosmo-virt3:sdc
 ceph-deploy osd prepare kosmo-virt3:sdc
 ceph-deploy disk zap kosmo-virt3:sdd
 ceph-deploy osd prepare kosmo-virt3:sdd
 ceph-deploy disk zap kosmo-virt3:sde
 ceph-deploy osd prepare kosmo-virt3:sde
 ceph-deploy disk zap kosmo-virt3:sdf
 ceph-deploy osd prepare kosmo-virt3:sdf
 ceph-deploy disk zap kosmo-virt3:sdg
 ceph-deploy osd prepare kosmo-virt3:sdg

where sd[b-g] are the SSD disks.

MDS deploying:

The new Giant version of Ceph doesn't create the data and metadata OSD pools by default.
Use ceph osd lspools to check.

 ceph osd pool create data 1024
 ceph osd pool set data min_size 1
 ceph osd pool set data size 2
 ceph osd pool create metadata 1024
 ceph osd pool set metadata min_size 1
 ceph osd pool set metadata size 2

Check the pool IDs of data and metadata with:

 ceph osd lspools

Configure FS

 ceph mds newfs 4 3 --yes-i-really-mean-it

where 4 is the ID of the metadata pool and 3 is the ID of the data pool.

Configure MDS

(on kosmo-virt1)

 cd /etc/ceph
 ceph-deploy mds create kosmo-virt1

(on kosmo-virt2)

 cd /etc/ceph
 ceph-deploy mds create kosmo-virt2

(on all nodes)

 chkconfig ceph on

Configure kosmo-arch.

Copy /etc/ceph/ceph.conf and /etc/ceph/ceph.client.admin.keyring from any of the kosmo-virt nodes to kosmo-arch.

Preparing Ceph for OpenNebula.

Create pool:

 ceph osd pool create one 4096
 ceph osd pool set one min_size 1
 ceph osd pool set one size 2

Setup authorization to pool one:

 ceph auth get-or-create client.oneadmin mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=one' > /etc/ceph/ceph.client.oneadmin.keyring

Get key from keyring:

  cat /etc/ceph/ceph.client.oneadmin.keyring | grep key | awk '{print $3}' >>  /etc/ceph/oneadmin.key

Checking:

 ceph auth list

Copy /etc/ceph/ceph.client.oneadmin.keyring and /etc/ceph/oneadmin.key to the second node.

Preparing for OpenNebula HA

Configuring MariaDB cluster

Configure MariaDB cluster on all nodes except kosmo-arch

Setup repo:

 cat << EOT > /etc/yum.repos.d/mariadb.repo
 [mariadb]
 name = MariaDB
 baseurl = http://yum.mariadb.org/10.0/centos7-amd64
 gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
 gpgcheck=1
 EOT

Install:

 yum install MariaDB-Galera-server MariaDB-client rsync galera

start service:

 service mysql start
 chkconfig mysql on
 mysql_secure_installation

prepare for cluster:

 mysql -p
 GRANT USAGE ON *.* to sst_user@'%' IDENTIFIED BY 'PASS';
 GRANT ALL PRIVILEGES on *.* to sst_user@'%';
 FLUSH PRIVILEGES;
 exit
 service mysql stop

configuring cluster: (for kosmo-virt1)

 cat << EOT > /etc/my.cnf
 collation-server = utf8_general_ci
 init-connect = 'SET NAMES utf8'
 character-set-server = utf8
 binlog_format=ROW
 default-storage-engine=innodb
 innodb_autoinc_lock_mode=2
 innodb_locks_unsafe_for_binlog=1
 query_cache_size=0
 query_cache_type=0
 bind-address=0.0.0.0
 datadir=/var/lib/mysql
 innodb_log_file_size=100M
 innodb_file_per_table
 innodb_flush_log_at_trx_commit=2
 wsrep_provider=/usr/lib64/galera/libgalera_smm.so
 wsrep_cluster_address="gcomm://172.19.254.1,172.19.254.2,172.19.254.3"
 wsrep_cluster_name='scanex_galera_cluster'
 wsrep_node_address='172.19.254.1' # setup real node ip
 wsrep_node_name='kosmo-virt1' #  setup real node name
 wsrep_sst_method=rsync
 wsrep_sst_auth=sst_user:PASS
 EOT

(for kosmo-virt2)

 cat << EOT > /etc/my.cnf
 collation-server = utf8_general_ci
 init-connect = 'SET NAMES utf8'
 character-set-server = utf8
 binlog_format=ROW
 default-storage-engine=innodb
 innodb_autoinc_lock_mode=2
 innodb_locks_unsafe_for_binlog=1
 query_cache_size=0
 query_cache_type=0
 bind-address=0.0.0.0
 datadir=/var/lib/mysql
 innodb_log_file_size=100M
 innodb_file_per_table
 innodb_flush_log_at_trx_commit=2
 wsrep_provider=/usr/lib64/galera/libgalera_smm.so
 wsrep_cluster_address="gcomm://172.19.254.1,172.19.254.2"
 wsrep_cluster_name='scanex_galera_cluster'
 wsrep_node_address='172.19.254.2' # setup real node ip
 wsrep_node_name='kosmo-virt2' #  setup real node name
 wsrep_sst_method=rsync
 wsrep_sst_auth=sst_user:PASS
 EOT

(for kosmo-virt3)

 cat << EOT > /etc/my.cnf
 collation-server = utf8_general_ci
 init-connect = 'SET NAMES utf8'
 character-set-server = utf8
 binlog_format=ROW
 default-storage-engine=innodb
 innodb_autoinc_lock_mode=2
 innodb_locks_unsafe_for_binlog=1
 query_cache_size=0
 query_cache_type=0
 bind-address=0.0.0.0
 datadir=/var/lib/mysql
 innodb_log_file_size=100M
 innodb_file_per_table
 innodb_flush_log_at_trx_commit=2
 wsrep_provider=/usr/lib64/galera/libgalera_smm.so
 wsrep_cluster_address="gcomm://172.19.254.1,172.19.254.2,172.19.254.3"
 wsrep_cluster_name='scanex_galera_cluster'
 wsrep_node_address='172.19.254.3' # setup real node ip
 wsrep_node_name='kosmo-virt3' #  setup real node name
 wsrep_sst_method=rsync
 wsrep_sst_auth=sst_user:PASS
 EOT

(on kosmo-virt1)

 /etc/init.d/mysql start --wsrep-new-cluster

(on kosmo-virt2)

 /etc/init.d/mysql start

(on kosmo-virt3)

 /etc/init.d/mysql start

check on all nodes:

 mysql -p
 show status like 'wsrep%';

Variable_name                        Value

wsrep_local_state_uuid 739895d5-d6de-11e4-87f6-3a3244f26574
wsrep_protocol_version 7
wsrep_last_committed 0
wsrep_replicated 0
wsrep_replicated_bytes 0
wsrep_repl_keys 0
wsrep_repl_keys_bytes 0
wsrep_repl_data_bytes 0
wsrep_repl_other_bytes 0
wsrep_received 6
wsrep_received_bytes 425
wsrep_local_commits 0
wsrep_local_cert_failures 0
wsrep_local_replays 0
wsrep_local_send_queue 0
wsrep_local_send_queue_max 1
wsrep_local_send_queue_min 0
wsrep_local_send_queue_avg 0.000000
wsrep_local_recv_queue 0
wsrep_local_recv_queue_max 1
wsrep_local_recv_queue_min 0
wsrep_local_recv_queue_avg 0.000000
wsrep_local_cached_downto 18446744073709551615
wsrep_flow_control_paused_ns 0
wsrep_flow_control_paused 0.000000
wsrep_flow_control_sent 0
wsrep_flow_control_recv 0
wsrep_cert_deps_distance 0.000000
wsrep_apply_oooe 0.000000
wsrep_apply_oool 0.000000
wsrep_apply_window 0.000000
wsrep_commit_oooe 0.000000
wsrep_commit_oool 0.000000
wsrep_commit_window 0.000000
wsrep_local_state 4
wsrep_local_state_comment Synced
wsrep_cert_index_size 0
wsrep_causal_reads 0
wsrep_cert_interval 0.000000
wsrep_incoming_addresses 172.19.254.1:3306,172.19.254.3:3306,172.19.254.2:3306
wsrep_evs_delayed
wsrep_evs_evict_list
wsrep_evs_repl_latency 0/0/0/0/0
wsrep_evs_state OPERATIONAL
wsrep_gcomm_uuid 7397d6d6-d6de-11e4-a515-d3302a8c2342
wsrep_cluster_conf_id 2
wsrep_cluster_size 2
wsrep_cluster_state_uuid 739895d5-d6de-11e4-87f6-3a3244f26574
wsrep_cluster_status Primary
wsrep_connected ON
wsrep_local_bf_aborts 0
wsrep_local_index 0
wsrep_provider_name Galera
wsrep_provider_vendor Codership Oy info@codership.com
wsrep_provider_version 25.3.9(r3387)
wsrep_ready ON
wsrep_thread_count 2


Creating user and database:

mysql -p
create database opennebula;
GRANT USAGE ON opennebula.* to oneadmin@'%' IDENTIFIED BY 'PASS';
GRANT ALL PRIVILEGES on opennebula.* to oneadmin@'%';
FLUSH PRIVILEGES;

Remember: if all nodes go down, the most up-to-date node must be started first with /etc/init.d/mysql start --wsrep-new-cluster, so you need to identify that node. If you bootstrap the cluster from a node with an outdated state, the other nodes will log an error like: [ERROR] WSREP: gcs/src/gcs_group.cpp:void group_post_state_exchange(gcs_group_t*)():319: Reversing history: 0 → 0, this member has applied 140536161751824 more events than the primary component. Data loss is possible. Aborting.
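
A common way to identify the most up-to-date node (not covered in the post) is to compare the seqno recorded in the Galera state file on each node and bootstrap from the one with the highest value:

 cat /var/lib/mysql/grastate.dat   # compare the "seqno:" value across all nodes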

Configuring HA cluster

Unfortunately, the pcs cluster stack conflicts with the OpenNebula server. That's why we will go with pacemaker, corosync and crmsh.

Installing HA

Set up repo on all nodes except kosmo-arch:

 cat << EOT > /etc/yum.repos.d/network\:ha-clustering\:Stable.repo
 [network_ha-clustering_Stable]
 name=Stable High Availability/Clustering packages (CentOS_CentOS-7)
 type=rpm-md
 baseurl=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/
 gpgcheck=1
 gpgkey=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/repodata/repomd.xml.key
 enabled=1
 EOT

Install on all nodes except kosmo-arch:

 yum install corosync pacemaker crmsh resource-agents -y

On kosmo-virt1, create the configuration:

 vi /etc/corosync/corosync.conf
 totem {
 version: 2  
 secauth: off
 cluster_name: cluster
 transport: udpu
 }
 nodelist {
 node {
      ring0_addr: kosmo-virt1
      nodeid: 1
     }
 node {
      ring0_addr: kosmo-virt2
      nodeid: 2
     }
  node {
      ring0_addr: kosmo-virt3
      nodeid: 3
     }
 }
 quorum {
 provider: corosync_votequorum
 }
 logging {
 to_syslog: yes
 }

and create the authkey on kosmo-virt1:

 cd /etc/corosync
 corosync-keygen

Copy corosync.conf and the authkey to kosmo-virt2 and kosmo-virt3.
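
For example, with plain scp (assuming the default /etc/corosync paths):

 scp /etc/corosync/corosync.conf /etc/corosync/authkey kosmo-virt2:/etc/corosync/
 scp /etc/corosync/corosync.conf /etc/corosync/authkey kosmo-virt3:/etc/corosync/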

Enabling (on all nodes except kosmo-arch):

 systemctl enable pacemaker corosync

Starting (on all nodes except kosmo-arch):

 systemctl start pacemaker corosync

Checking:

 crm status
 
 Last updated: Mon Mar 30 18:33:14 2015
 Last change: Mon Mar 30 18:23:47 2015 via crmd on kosmo-virt2
 Stack: corosync
 Current DC: kosmo-virt2 (2) - partition with quorum
 Version: 1.1.10-32.el7_0.1-368c726
 3 Nodes configured
 0 Resources configured
 Online: [ kosmo-virt1 kosmo-virt2 kosmo-virt3]

Add properties:

crm configure property stonith-enabled=false
crm configure property no-quorum-policy=stop

Installing OpenNebula

Installing

Setup repo on all nodes except kosmo-arch:

 cat << EOT > /etc/yum.repos.d/opennebula.repo
 [opennebula]
 name=opennebula
 baseurl=http://downloads.opennebula.org/repo/4.12/CentOS/7/x86_64/
 enabled=1
 gpgcheck=0
 EOT

Installing (on all nodes except kosmo-arch):

 yum install -y opennebula-server opennebula-sunstone opennebula-node-kvm qemu-img qemu-kvm

Ruby Runtime Installation:

 /usr/share/one/install_gems

Change the oneadmin password:

 passwd oneadmin

Create passwordless SSH access for oneadmin (on kosmo-virt1):

 su oneadmin
 cd ~/.ssh
 ssh-keygen -t dsa
 cat id_dsa.pub >> authorized_keys
 chown oneadmin:oneadmin authorized_keys
 chmod 600 authorized_keys
 echo "StrictHostKeyChecking no" > config

Copy the keys to the other nodes (remember that oneadmin's home directory is /var/lib/one).

Change the listen address for sunstone-server (on all nodes):

 sed -i 's/host:\ 127\.0\.0\.1/host:\ 0\.0\.0\.0/g' /etc/one/sunstone-server.conf

On kosmo-virt1:

Copy all /var/lib/one/.one/*.auth files and the one.key file to OTHER_NODES:/var/lib/one/.one/.

Start the services on kosmo-virt1 (they will be stopped again after checking):

 
 systemctl start opennebula opennebula-sunstone

Try to connect to http://node:9869.
Check logs for errors (/var/log/one/oned.log /var/log/one/sched.log /var/log/one/sunstone.log).
If no errors:

 systemctl stop opennebula opennebula-sunstone

Add Ceph support to qemu-kvm on all nodes except kosmo-arch. First check whether rbd is already supported:

 qemu-img -h | grep rbd
 /usr/libexec/qemu-kvm --drive format=? | grep rbd

If there is no rbd support then you have to compile and install:

 qemu-kvm-rhev
 qemu-kvm-common-rhev 
 qemu-img-rhev

Download:

 yum groupinstall -y "Development Tools"
 yum install -y yum-utils rpm-build
 yumdownloader --source qemu-kvm
 rpm -ivh qemu-kvm-1.5.3-60.el7_0.11.src.rpm

Compiling.

 
 cd ~/rpmbuild/SPECS
 vi qemu-kvm.spec

Change %define rhev 0 to %define rhev 1.
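
If you prefer, the same change can be made non-interactively with sed (an optional convenience, not part of the original instructions):

 sed -i 's/%define rhev 0/%define rhev 1/' qemu-kvm.spec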

 rpmbuild -ba qemu-kvm.spec

Installing (for all nodes except kosmo-arch).

 rpm -e --nodeps libcacard-1.5.3-60.el7_0.11.x86_64
 rpm -e --nodeps qemu-img-1.5.3-60.el7_0.11.x86_64
 rpm -e --nodeps qemu-kvm-common-1.5.3-60.el7_0.11.x86_64
 rpm -e --nodeps qemu-kvm-1.5.3-60.el7_0.11.x86_64
 rpm -ivh libcacard-rhev-1.5.3-60.el7.centos.11.x86_64.rpm
 rpm -ivh qemu-img-rhev-1.5.3-60.el7.centos.11.x86_64.rpm
 rpm -ivh qemu-kvm-common-rhev-1.5.3-60.el7.centos.11.x86_64.rpm
 rpm -ivh qemu-kvm-rhev-1.5.3-60.el7.centos.11.x86_64.rpm

Check for ceph support.

 qemu-img -h | grep rbd
 Supported formats: vvfat vpc vmdk vhdx vdi sheepdog sheepdog sheepdog rbd raw host_cdrom host_floppy host_device file qed qcow2 qcow parallels nbd nbd nbd iscsi gluster gluster gluster gluster dmg cow cloop bochs blkverify    blkdebug
 /usr/libexec/qemu-kvm --drive format=? | grep rbd
 Supported formats: vvfat vpc vmdk vhdx vdi sheepdog sheepdog sheepdog rbd raw host_cdrom host_floppy host_device file qed qcow2 qcow parallels nbd nbd nbd iscsi gluster gluster gluster gluster dmg cow cloop bochs blkverify blkdebug

Try to write an image (on all nodes except kosmo-arch):

 qemu-img create -f rbd rbd:one/test-virtN 10G

where N is the node number.
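
You can verify that the test image reached the pool and then remove it (a quick sanity check, not part of the original instructions):

 rbd -p one ls              # the test-virtN image should be listed
 rbd -p one rm test-virtN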

Add ceph support for libvirt

On all nodes:

 systemctl enable messagebus.service
 systemctl start messagebus.service
 systemctl enable libvirtd.service
 systemctl start libvirtd.service

On kosmo-virt1 create uuid:

 uuidgen
 cfb34c4b-d95c-4abc-a4cc-f8a2ae532cb5

Create secret.xml

 
 cat > secret.xml <<EOF
 <secret ephemeral='no' private='no'>
 <uuid>cfb34c4b-d95c-4abc-a4cc-f8a2ae532cb5</uuid>
 <usage type='ceph'>
 <name>client.oneadmin AQDp1aqz+JPAJhAAIcKf/Of0JfpJRQvfPLqn9Q==</name>
 </usage>
 </secret>
 EOF

Where AQDp1aqz+JPAJhAAIcKf/Of0JfpJRQvfPLqn9Q== is the output of cat /etc/ceph/oneadmin.key.
Copy secret.xml to other nodes.

Add key to libvirt (for all nodes except kosmo-arch)

 virsh secret-define --file secret.xml
 virsh secret-set-value --secret cfb34c4b-d95c-4abc-a4cc-f8a2ae532cb5 --base64 $(cat /etc/ceph/oneadmin.key)

check

 virsh secret-list
 UUID                                 Usage
 -----------------------------------------------------------
 cfb34c4b-d95c-4abc-a4cc-f8a2ae532cb5 ceph client.oneadmin AQDp1aqz+JPAJhAAIcKf/Of0JfpJRQvfPLqn9Q==

Restart libvirtd:

 systemctl restart libvirtd.service

Converting the database to MySQL:

Downloading script:

 wget http://www.redmine.org/attachments/download/6239/sqlite3-to-mysql.py

Converting:

 sqlite3 /var/lib/one/one.db .dump | ./sqlite3-to-mysql.py > mysql.sql   
 mysql -u oneadmin -p opennebula < mysql.sql

Change /etc/one/oned.conf from

 DB = [ backend = "sqlite" ]

to

 DB = [ backend = "mysql",
      server  = "localhost",
      port    = 0,
      user    = "oneadmin",
      passwd  = "PASS",
      db_name = "opennebula" ]

Copy oned.conf as root to the other nodes, except kosmo-arch.

Check kosmo-virt2 and kosmo-virt3 nodes in turn:

   systemctl start opennebula opennebula-sunstone

check the logs for errors (/var/log/one/oned.log /var/log/one/sched.log /var/log/one/sunstone.log) and, if everything is fine, stop the services again:

   systemctl stop opennebula opennebula-sunstone

Creating HA resources

On all nodes except kosmo-arch:

 systemctl disable opennebula opennebula-sunstone opennebula-novnc

From any of the nodes except kosmo-arch:

 crm
 configure
 primitive ClusterIP ocf:heartbeat:IPaddr2 params ip="192.168.14.41" cidr_netmask="24" op monitor interval="30s"
 primitive opennebula_p systemd:opennebula \
 op monitor interval=60s timeout=20s \
 op start interval="0" timeout="120s" \
 op stop  interval="0" timeout="120s" 
 primitive opennebula-sunstone_p systemd:opennebula-sunstone \
 op monitor interval=60s timeout=20s \
 op start interval="0" timeout="120s" \
 op stop  interval="0" timeout="120s" 
 primitive opennebula-novnc_p systemd:opennebula-novnc \
 op monitor interval=60s timeout=20s \
 op start interval="0" timeout="120s" \
 op stop  interval="0" timeout="120s" 
 group Opennebula_HA ClusterIP opennebula_p opennebula-sunstone_p  opennebula-novnc_p
 exit

Check

 crm status
 Last updated: Tue Mar 31 16:43:00 2015
 Last change: Tue Mar 31 16:40:22 2015 via cibadmin on kosmo-virt1
 Stack: corosync
 Current DC: kosmo-virt2 (2) - partition with quorum
 Version: 1.1.10-32.el7_0.1-368c726
 3 Nodes configured
 4 Resources configured
 Online: [ kosmo-virt1 kosmo-virt2 kosmo-virt3 ]
 Resource Group: Opennebula_HA
   ClusterIP  (ocf::heartbeat:IPaddr2):       Started kosmo-virt1
   opennebula_p       (systemd:opennebula):   Started kosmo-virt1
   opennebula-sunstone_p      (systemd:opennebula-sunstone):  Started kosmo-virt1
   opennebula-novnc_p (systemd:opennebula-novnc):     Started kosmo-virt1

Configuring OpenNebula

http://active_node:9869 – web management.

Using the web management interface: 1. Create a cluster. 2. Add the hosts (using the 192.168.14.0 network).
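
For reference, the same two steps can also be done from the command line as oneadmin; the exact flags may vary between versions, so check onecluster --help and onehost --help (the cluster name and driver choices below are illustrative):

 onecluster create one-cluster
 onehost create kosmo-virt1 --im kvm --vm kvm --net dummy
 onecluster addhost one-cluster kosmo-virt1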

Console management.

3. Add a virtual network (as oneadmin):

 
 cat << EOT > def.net
 NAME    = "Shared LAN"
 TYPE    = RANGED
 # Now we'll use the host private network (physical)
 BRIDGE  = nab0
 NETWORK_SIZE    = C
 NETWORK_ADDRESS = 192.168.14.0
 EOT
 onevnet create def.net

4. Create the RBD image datastore (as oneadmin):

 cat << EOT > rbd.conf
 NAME = "cephds"
 DS_MAD = ceph
 TM_MAD = ceph
 DISK_TYPE = RBD
 POOL_NAME = one
 BRIDGE_LIST ="192.168.14.42 192.168.14.43 192.168.14.44"
 CEPH_HOST ="172.19.254.1:6789 172.19.254.2:6789 172.19.254.3:6789"
 CEPH_SECRET ="cfb34c4b-d95c-4abc-a4cc-f8a2ae532cb5" #uuid key, looked at libvirt authentication for ceph
 CEPH_USER = oneadmin
 EOT

 onedatastore create rbd.conf

5. Create system ceph datastore.

Check the last datastore ID number, N:

onedatastore list

On all nodes, create the directory and mount CephFS:

mkdir /var/lib/one/datastores/N+1
echo "172.19.254.K:6789:/ /var/lib/one/datastores/N+1 ceph rw,relatime,name=admin,secret=AQB4jxJV8PuhJhAAdsdsdRBkSFrtr0VvnQNljBw==,nodcache 0 0 # see secret in /etc/ceph/ceph.client.admin.keyring" >> /etc/fstab
mount /var/lib/one/datastores/N+1

where K is the last octet of the current node's IP.

From one node, change the permissions:

chown oneadmin:oneadmin /var/lib/one/datastores/N+1

Create system ceph datastore (su oneadmin):

 cat << EOT > sys_fs.conf
 NAME    = system_ceph
 TM_MAD  = shared
 TYPE    = SYSTEM_DS
 EOT

 onedatastore create sys_fs.conf

6. Using the web interface, add the hosts, vnets and datastores to the created cluster.

HA VM

Here is the official documentation.
One comment, though: I'm using the migrate action instead of the recreate command.

 /etc/one/oned.conf
 HOST_HOOK = [
  name      = "error",
  on        = "ERROR",
  command   = "host_error.rb",
  arguments = "$HID -m",
  remote    = no ]

BACKUP

Some words about backup.

Use the persistent image type for this backup scheme.

For backup, a single Linux server, kosmo-arch (a Ceph client), with ZFS on Linux installed was used. Deduplication is enabled on the zpool. (Remember that deduplication requires about 2 GB of memory per 1 TB of storage space.)

Example of a simple script started by cron:

#!/bin/sh
currdate=`/bin/date +%Y-%m-%0e`
olddate=`/bin/date --date="60 days ago" +%Y-%m-%0e`
imagelist="one-21" #space delimited list
for i in $imagelist
do
snapcurchk=`/usr/bin/rbd -p one ls | grep $i | grep $currdate`
snapoldchk=`/usr/bin/rbd -p one ls | grep $i | grep $olddate`
if test -z "$snapcurchk"
 then
  /usr/bin/rbd snap create --snap $currdate one/$i
  /usr/bin/rbd export one/$i@$currdate /rbdback/$i-$currdate
 else
  echo "current snapshot exist" 
fi
if test -z "$snapoldchk"
  then
   echo "old snapshot doesn't exist"
  else
  /usr/bin/rbd snap rm one/$i@$olddate
  /bin/rm -f /rbdback/$i-$olddate
 fi
done


Use the onevm utility or the web interface (see the VM template) to find out which image is assigned to a VM.

onevm list
onevm show "VM_ID" -a | grep IMAGE_ID

PS

Don't forget to change the storage driver for the VM disks to vda (virtio); Windows guests need the virtio drivers installed. Without that you will face low I/O performance (no more than 100 MB/s).
I saw 415 MB/s with the virtio drivers.
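
In the OpenNebula VM template this corresponds to using the virtio bus for the disk, for example via DEV_PREFIX (the image ID below is illustrative):

 DISK = [
   IMAGE_ID   = 21,
   DEV_PREFIX = "vd"   # expose the disk as vda (virtio) instead of hda/sda
 ]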

Links.

The OpenNebula team is proud to announce a new maintenance release, OpenNebula 4.12.1 Cotton Candy. This release comes with several bug fixes found after the 4.12 release, covering different OpenNebula components, such as the scheduler, the Cloud View self-service portal, the Sunstone web interface, the OpenNebula core and several drivers (VM, Auth, Network). Check the full list of bug fixes in the development portal.

Besides the bug fixes mentioned above, 4.12.1 includes several improvements, such as VNC capabilities for VMs imported from vCenter, a logrotate script for OpenNebula logs and, especially, a scheduler revisited to cope with large XML files.

If you haven't had the chance to try OpenNebula 4.12 so far, now is the time to download and install OpenNebula 4.12.1 Cotton Candy. As a highlight, see below the new showback feature, which enables the generation of cost reports that can be integrated with chargeback and billing platforms: