The Ceph Datastore 4.0

Overview

The Ceph datastore driver enables OpenNebula users to use Ceph block devices as their Virtual Images.

:!: This driver only works with libvirt/KVM drivers. Xen is not (yet) supported.

:!: This driver requires the OpenNebula nodes using the Ceph driver to be part of a running Ceph cluster. More information in the Ceph documentation.

:!: The hypervisor nodes need to be part of a working Ceph cluster, and the Libvirt and QEMU packages need to be recent enough to support Ceph. For Ubuntu systems this support is available out of the box; for CentOS systems you will need to manually install this version of qemu-kvm.
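
A quick way to check whether the QEMU packages on a node were built with Ceph support is to look for rbd among the formats reported by qemu-img (output and packaging vary by distribution):

<xterm>
# 'rbd' should appear in the list of supported formats
$ qemu-img --help | grep rbd
</xterm>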

The Ceph datastore should always be used with the shared system datastore, in order to provide support for live migrations.

Another option is to manually patch the pre- and post-migrate scripts of the ssh system datastore to scp the files residing in the system datastore before the live migration. Read more.

Requirements

Ceph cluster Configuration

The hosts where Ceph datastore based images will be deployed must be part of a running Ceph cluster. To set one up, refer to the Ceph documentation.

The Ceph cluster must be configured in such a way that no specific authentication is required, which means that for cephx authentication the keyring must be in the expected path so that the rbd and ceph commands work without explicitly specifying the keyring's location.

Also, the mon daemon must be defined in ceph.conf on all the nodes, so that the hostname and port don't need to be specified explicitly in any Ceph command.
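
As a quick sanity check, the following commands should succeed on every node without passing any monitor address, user or keyring options explicitly (the pool name one is the default used throughout this guide):

<xterm>
# Both commands must work without extra -m, --id or --keyring options
$ ceph status
$ rbd ls one
</xterm>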

Additionally, each OpenNebula datastore is backed by a Ceph pool; these pools must be created and configured in the Ceph cluster. The default pool name is one, but it can be changed on a per-datastore basis (see below).
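
If the pool does not exist yet, it can be created from any Ceph admin node; a minimal sketch, where the placement group count (128) is only an example and should be sized for your cluster:

<xterm>
# Create the 'one' pool and verify it is listed
$ ceph osd pool create one 128
$ ceph osd lspools
</xterm>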

OpenNebula Ceph frontend

This driver requires the system administrator to choose an OpenNebula Ceph frontend (which needs to be a node in the Ceph cluster) where many of the datastore storage actions will take place. For instance, when creating an image, OpenNebula will transfer the image to this Ceph frontend and run qemu-img convert -O rbd on that node.

Note that this Ceph frontend can be any node in the OpenNebula setup: the OpenNebula frontend, any worker node, or a specific node (recommended). The default value is localhost, so these actions will take place on the OpenNebula frontend; administrators are encouraged to change this by adding HOST=… to their Ceph datastore template.

This node must have qemu-img installed.
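
As an illustration of what happens on this node, registering a qcow2 file roughly boils down to a conversion like the following, run by the driver scripts (the source path and the target image name one-42 are purely illustrative; the pool one is the default used in this guide):

<xterm>
# Roughly what the driver runs on the Ceph frontend when registering an image
$ qemu-img convert -O rbd /var/tmp/image.qcow2 rbd:one/one-42
</xterm>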

OpenNebula Hosts

There are no specific requirements for the hosts besides being libvirt/KVM nodes, since Xen is not (yet) supported by the Ceph drivers.

Configuration

Configuring the System Datastore

To use the ceph drivers, the system datastore can be either shared or ssh. This system datastore will hold only the symbolic links to the block devices, so it will not take much space. See more details in the System Datastore Guide.

It will also be used to hold context images and disks created on the fly; these will be created as regular files.
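
A minimal sketch, assuming the default system datastore (ID 0) is the one to be used with shared: onedatastore update opens an editor on the datastore template, where the transfer driver can be set (the exact CLI behaviour may vary slightly between OpenNebula versions):

<xterm>
$ onedatastore update 0
# in the editor, make sure the template contains:
TM_MAD = "shared"
</xterm>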

Configuring Ceph Datastores

The first step to create a Ceph datastore is to set up a template file for it. The following table lists the supported configuration attributes. The datastore type is set by its drivers; in this case, be sure to add DS_MAD=ceph and TM_MAD=ceph for the transfer mechanism (see below).

^ Attribute ^ Description ^
| NAME | The name of the datastore |
| DS_MAD | The DS type, use ceph for the Ceph datastore |
| TM_MAD | Transfer drivers for the datastore, use ceph, see below |
| DISK_TYPE | The type must be RBD |
| HOST | Any OpenNebula Ceph frontend. Defaults to localhost |
| POOL_NAME | The OpenNebula Ceph pool name. Defaults to one. This pool must exist before using the drivers. |
| STAGING_DIR | Default path for image operations in the OpenNebula Ceph frontend. |
| RESTRICTED_DIRS | Paths that cannot be used to register images. A space separated list of paths. :!: |
| SAFE_DIRS | If you need to un-block a directory under one of the RESTRICTED_DIRS. A space separated list of paths. |
| NO_DECOMPRESS | Do not try to untar or decompress the file to be registered. Useful for specialized Transfer Managers. |
| LIMIT_TRANSFER_BW | Specify the maximum transfer rate in bytes/second when downloading images from a http/https URL. Suffixes K, M or G can be used. |

:!: This will prevent users registering important files as VM images and accessing them through their VMs. OpenNebula will automatically add its configuration directories: /var/lib/one, /etc/one and oneadmin's home. If users try to register an image from a restricted directory, they will get the following error message: “Not allowed to copy image file”.
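
For instance, a datastore template could combine both attributes as follows (the paths are purely illustrative):

<xterm>
# Block registrations from /etc and /var, but allow a dedicated staging area
RESTRICTED_DIRS = "/etc /var"
SAFE_DIRS = "/var/tmp/images"
</xterm>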

For example, the following illustrates the creation of a Ceph datastore using a configuration file. In this case we will use the host cephfrontend as the OpenNebula Ceph frontend.

<xterm>
# The 'one' pool must exist
$ ceph osd lspools
0 data,1 metadata,2 rbd,6 one,

$ cat ds.conf
NAME   = "cephds"
DS_MAD = ceph
TM_MAD = ceph

# the following line *must* be present
DISK_TYPE = RBD

POOL_NAME = one
HOST      = cephfrontend

$ onedatastore create ds.conf
ID: 101

$ onedatastore list
  ID NAME            CLUSTER  IMAGES TYPE   TM
   0 system          none     0      fs     shared
   1 default         none     3      fs     shared
 101 cephds          none     0      ceph   ceph
</xterm>

You can check more details of the datastore by issuing the onedatastore show command.

:!: Note that datastores are not associated to any cluster by default, and they are supposed to be accessible by every single host. If you need to configure datastores for just a subset of the hosts take a look at the Cluster guide.
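
Should you need that, adding the Ceph datastore and the Ceph-enabled hosts to a cluster is done with the onecluster command (the cluster ID 100 and the host name ceph-node1 below are illustrative):

<xterm>
# Add the Ceph datastore (ID 101 from the example above) and a host to cluster 100
$ onecluster adddatastore 100 101
$ onecluster addhost 100 ceph-node1
</xterm>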

Using the Ceph transfer driver

The workflow for Ceph images is similar to the other datastores, which means that a user will create an image inside the Ceph datastores by providing a path to an image file locally available in the OpenNebula frontend, or an HTTP URL, and the driver will convert it to a Ceph block device.

All the usual operations are available: oneimage create, oneimage delete, oneimage clone, oneimage persistent, oneimage nonpersistent, onevm disk-snapshot, etc.
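
For instance, registering a local qcow2 file in the cephds datastore created above could look like this (the image name and source path are illustrative):

<xterm>
# The file is uploaded to the Ceph frontend and converted to an RBD block device
$ oneimage create --name ubuntu-server --path /var/tmp/ubuntu.qcow2 --datastore cephds
</xterm>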

Tuning & Extending

System administrators and integrators are encouraged to modify these drivers in order to integrate them with their datacenter:

Under /var/lib/one/remotes/:

  • datastore/ceph/ceph.conf: Default values for the ceph driver parameters (a sample is shown after this list)
    • HOST: Default OpenNebula Ceph frontend
    • POOL_NAME: Default Ceph pool name
    • STAGING_DIR: Default path for image operations in the OpenNebula Ceph frontend.
  • datastore/ceph/cp: Registers a new image. Creates the corresponding RBD image in the Ceph pool.
  • datastore/ceph/mkfs: Makes a new empty image. Creates the corresponding RBD image in the Ceph pool.
  • datastore/ceph/rm: Removes the RBD image from the Ceph pool.
  • tm/ceph/ln: Does nothing, since access to the RBD image is handled directly by libvirt.
  • tm/ceph/clone: Copies the image to a new image.
  • tm/ceph/mvds: Saves the image in a Ceph block device for SAVE_AS.
  • tm/ceph/delete: Removes a non-persistent image from the Virtual Machine directory if it hasn't been subject to a disk-snapshot operation.
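
As referenced in the list above, a minimal sketch of what datastore/ceph/ceph.conf could contain, assuming the defaults used throughout this guide (the actual file shipped with your OpenNebula version may differ):

<xterm>
# /var/lib/one/remotes/datastore/ceph/ceph.conf -- driver defaults (illustrative)
POOL_NAME=one
STAGING_DIR=/var/tmp
HOST=localhost
</xterm>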