The Ceph Datastore 4.4
The Ceph datastore driver enables OpenNebula users to use Ceph block devices as their Virtual Images.
The hosts where images from Ceph datastores will be deployed must be part of a running Ceph cluster; to set one up, refer to the Ceph documentation.
The Ceph cluster must be configured in such a way that no specific authentication is required: for cephx authentication the keyring must be in the expected path, so that the rbd and ceph commands work without explicitly specifying the keyring's location. Also, the mon daemon must be defined in ceph.conf on all the nodes, so that hostname and port do not need to be specified explicitly in any Ceph command.
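As a quick sanity check, the following commands (a sketch only; the pool name assumes the default one pool described below) should succeed on every node without any extra options:
<xterm>
# both commands must work with no extra -m/-k/--keyring options
ceph -s
rbd ls -p one      # assumes the "one" pool, created further below
</xterm>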
Additionally, each OpenNebula datastore is backed by a Ceph pool; these pools must be created and configured in the Ceph cluster. The pool name defaults to one but can be changed on a per-datastore basis (see below).
This driver requires the system administrator to specify one or several Ceph frontends (which need to be nodes in the Ceph cluster) where many of the datastore storage actions will take place. For instance, when creating an image, OpenNebula will choose one of the listed Ceph frontends (using a round-robin algorithm), transfer the image to that node and run qemu-img convert -O rbd there. These nodes need to be specified in the BRIDGE_LIST attribute.
Note that this Ceph frontend can be any node in the OpenNebula setup: the OpenNebula frontend, any worker node, or a specific node (recommended).
All the nodes listed in the BRIDGE_LIST variable must have qemu-img installed.
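For instance, a quick way to verify this from the OpenNebula frontend could be the following (cephfrontend stands for any host listed in BRIDGE_LIST):
<xterm>
ssh cephfrontend qemu-img --version
ssh cephfrontend ceph -s
</xterm>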
There are no specific requirements for the hosts besides being libvirt/KVM nodes, since Xen is not (yet) supported by the Ceph drivers.
To use the Ceph drivers, the system datastore can be either shared or ssh. This system datastore will hold only the symbolic links to the block devices, so it will not take much space. See more details in the System Datastore Guide.
It will also be used to hold context images and disks created on the fly; they will be created as regular files.
The first step to create a Ceph datastore is to set up a template file for it. The table below lists the supported configuration attributes. The datastore type is set by its drivers; in this case be sure to add DS_MAD=ceph and TM_MAD=ceph for the transfer mechanism, see below.
Attribute | Description |
---|---|
NAME | The name of the datastore |
DS_MAD | The DS type, use ceph for the Ceph datastore |
TM_MAD | Transfer drivers for the datastore, use ceph, see below |
DISK_TYPE | The type must be RBD |
BRIDGE_LIST | Mandatory space-separated list of Ceph servers that are going to be used as frontends. |
POOL_NAME | The OpenNebula Ceph pool name. Defaults to one. This pool must exist before using the drivers. |
STAGING_DIR | Default path for image operations in the OpenNebula Ceph frontend. |
RESTRICTED_DIRS | Paths that cannot be used to register images. A space-separated list of paths. |
SAFE_DIRS | If you need to un-block a directory under one of the RESTRICTED_DIRS. A space-separated list of paths. |
NO_DECOMPRESS | Do not try to untar or decompress the file to be registered. Useful for specialized Transfer Managers. |
LIMIT_TRANSFER_BW | Specify the maximum transfer rate in bytes/second when downloading images from a http/https URL. Suffixes K, M or G can be used. |
DATASTORE_CAPACITY_CHECK | If "yes", the available capacity of the datastore is checked before creating a new image. |
CEPH_HOST | Space-separated list of Ceph monitors. Example: "host1 host2:port2 host3 host4:port4" (if no port is specified, the default one is chosen). Required for Libvirt 1.x when cephx is enabled. |
CEPH_SECRET | A generated UUID for a libvirt secret (to hold the CephX authentication key in libvirt on each hypervisor). This should be generated when creating the Ceph datastore in OpenNebula. Required for Libvirt 1.x when cephx is enabled. |
For example, the following illustrates the creation of a Ceph datastore using a configuration file. In this case we will use the host cephfrontend as the OpenNebula Ceph frontend.
The one pool must already exist; if it doesn't, create it with:
<xterm>
$ ceph osd pool create one 128
$ ceph osd lspools
0 data,1 metadata,2 rbd,6 one,
</xterm>
An example of datastore:
<xterm>
$ cat ds.conf
NAME = "cephds"
DS_MAD = ceph
TM_MAD = ceph

# the following line *must* be present
DISK_TYPE = RBD

POOL_NAME = one
BRIDGE_LIST = cephfrontend

$ onedatastore create ds.conf
ID: 101

$ onedatastore list
  ID NAME            CLUSTER  IMAGES TYPE   TM
   0 system          none          0 fs     shared
   1 default         none          3 fs     shared
 101 cephds          none          0 ceph   ceph
</xterm>
The DS and TM MAD can be changed later using the onedatastore update command. You can check more details of the datastore by issuing the onedatastore show command.
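For example, using the datastore ID from the example above:
<xterm>
onedatastore show 101      # prints the full datastore template
onedatastore update 101    # opens an editor to modify DS_MAD, TM_MAD, etc.
</xterm>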
If Cephx is enabled, there are some special considerations the OpenNebula administrator must take into account.
Create a Ceph user for the OpenNebula hosts. We will use the name client.libvirt, but any other name is fine. Create the user in Ceph and grant it rwx permissions on the one pool:
<xterm> ceph auth get-or-create client.libvirt mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=one' </xterm>
Extract the client.libvirt key, save it to a file named client.libvirt.key and distribute it to all the KVM hosts:
<xterm> sudo ceph auth list # save client.libvirt's key to client.libvirt.key </xterm>
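Alternatively, assuming the ceph auth get-key subcommand is available in your Ceph version, the key can be extracted directly (the destination host and path are placeholders):
<xterm>
sudo ceph auth get-key client.libvirt > client.libvirt.key
scp client.libvirt.key kvmhost1:/var/tmp/
</xterm>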
Generate a UUID, for example by running uuidgen (the generated UUID will be referenced as %UUID% from now onwards).
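For instance:
<xterm>
uuidgen      # the printed value is the %UUID% used in the next steps
</xterm>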
Create a file named secret.xml (using the generated %UUID%) and distribute it to all the KVM hosts:
<xterm>
cat > secret.xml <<EOF
<secret ephemeral='no' private='no'>
  <uuid>%UUID%</uuid>
  <usage type='ceph'>
    <name>client.libvirt secret</name>
  </usage>
</secret>
EOF
</xterm>
The following commands must be executed on all the KVM hosts as oneadmin (assuming the secret.xml and client.libvirt.key files have been distributed to the hosts):
<xterm>
# Define the libvirt secret and set its value.
# Replace %UUID% with the value generated in the previous step
virsh secret-define secret.xml
virsh secret-set-value --secret %UUID% --base64 $(cat client.libvirt.key)
</xterm>
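To double-check that the secret was registered on a host, the standard virsh commands can be used:
<xterm>
virsh secret-list
virsh secret-get-value %UUID%    # should print the base64 CephX key
</xterm>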
Finally, the Ceph datastore must be updated to add the following values:
CEPH_USER="libvirt"
CEPH_SECRET="%UUID%"
CEPH_HOST="<list of ceph mon hosts, see table above>"
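These attributes can be added, for example, by editing the datastore template again (101 being the datastore ID from the example above):
<xterm>
onedatastore update 101    # append the CEPH_USER, CEPH_SECRET and CEPH_HOST lines in the editor
</xterm>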
You can read more information about this in the Ceph guide Using libvirt with Ceph.
The workflow for Ceph images is similar to the other datastores, which means that a user will create an image inside the Ceph datastores by providing a path to the image file locally available in the OpenNebula frontend, or to an http url, and the driver will convert it to a Ceph block device.
All the usual operations are available: oneimage create, oneimage delete, oneimage clone, oneimage persistent, oneimage nonpersistent, onevm disk-snapshot, etc.
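For instance, registering a local image into the Ceph datastore could look like this; the image name, source path and datastore name are only placeholders:
<xterm>
oneimage create --name "ubuntu-base" --path /tmp/ubuntu-base.qcow2 --datastore cephds
oneimage list
</xterm>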
System administrators and integrators are encouraged to modify these drivers in order to integrate them with their datacenter:
Under /var/lib/one/remotes/: the driver scripts, for example the one implementing the disk-snapshot operation.
Another option would be to manually patch the post and pre-migrate scripts for the ssh system datastore to scp the files residing in the system datastore before the live-migration. Read more.