 

 Ceph RADOS block device (RBD)

By Sebastien Han from http://www.sebastien-han.fr/blog/2012/06/10/introducing-ceph-to-openstack/

If you are using KVM or QEMU as your hypervisor, the Compute service can be configured to use Ceph's RADOS block devices (RBD) for volumes.

Ceph is a massively scalable, open source, distributed storage system. It comprises an object store, a block store, and a POSIX-compliant distributed file system. The platform can auto-scale to the exabyte level and beyond. It runs on commodity hardware, is self-healing and self-managing, and has no single point of failure. Ceph is in the Linux kernel and is integrated with the OpenStack™ cloud operating system. Because it is open source, this portable storage platform can be installed and used in public or private clouds.

 

Figure 11.1. Ceph-architecture.png


 RADOS?

The naming can be confusing: Ceph? RADOS?

RADOS (Reliable Autonomic Distributed Object Store) is an object store. RADOS distributes objects across the whole storage cluster and replicates them for fault tolerance. It is built from three major components:

  • Object Storage Device (OSD): the storage daemon (the RADOS service), which is where your data lives. This daemon must run on each server in your cluster. Each OSD can have an associated hard disk drive. For performance, it is usually better to pool your hard disk drives with RAID arrays, LVM, or btrfs pooling, so that each server runs a single daemon. By default, three pools are created: data, metadata, and RBD.

  • Meta-Data Server (MDS): this is where the metadata is stored. MDSs build a POSIX file system on top of objects for Ceph clients. However, if you are not using the Ceph File System, you do not need a metadata server.

  • Monitor (MON): this lightweight daemon handles all communication with external applications and clients. It also provides consensus for distributed decision making in a Ceph/RADOS cluster. For instance, when you mount a Ceph share on a client, you point to the address of a MON server. It checks the state and the consistency of the data. In an ideal setup, you run at least three ceph-mon daemons on separate servers. Quorum decisions are made by majority vote, so you need an odd number of monitors.
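The majority-vote rule for monitors can be illustrated with a little arithmetic. The following is a minimal sketch, not Ceph code: the function names quorum_size and tolerated_failures are hypothetical, and the only assumption is that a quorum is a strict majority of the configured monitors.

```python
def quorum_size(monitors: int) -> int:
    """Smallest number of monitors that forms a strict majority."""
    if monitors < 1:
        raise ValueError("a cluster needs at least one monitor")
    return monitors // 2 + 1


def tolerated_failures(monitors: int) -> int:
    """Monitors that can fail while a majority can still be formed."""
    return monitors - quorum_size(monitors)


if __name__ == "__main__":
    # Print the quorum size and failure tolerance for small clusters.
    for n in range(1, 8):
        print(f"{n} monitors: quorum={quorum_size(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

Under this assumption, going from three monitors to four does not let the cluster survive any additional failure, while going from three to five does. This is one way to see why odd monitor counts are preferred.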

Ceph developers recommend btrfs as the file system for storage. Using XFS is also possible and might be a better alternative for production environments. Neither Ceph nor btrfs is ready for production, and it could be really risky to put them together. This is why XFS is an excellent alternative to btrfs. The ext4 file system is also compatible but does not take advantage of all the power of Ceph.

Note

We recommend configuring Ceph to use the XFS file system in the near term, and btrfs in the long term once it is stable enough for production.

See ceph.com/docs/master/rec/filesystem/ for more information about usable file systems.

 Ways to store, use and expose data

There are several ways to store and access your data.

  • RADOS: as an object, the default storage mechanism.

  • RBD: as a block device. The Linux kernel RBD (RADOS block device) driver allows striping a Linux block device over multiple distributed object store data objects. It is compatible with the KVM RBD image.

  • CephFS: as a file, through a POSIX-compliant file system.
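To make the idea of striping a block device over multiple objects more concrete, here is a minimal sketch of how a byte range on a block device could map onto fixed-size objects. It is an illustration only, not the RBD driver's implementation: the function split_extent is hypothetical, and the 4 MiB object size is an assumption chosen for the example.

```python
# Assumed object size for this illustration (4 MiB); not taken from the text.
OBJECT_SIZE = 4 * 1024 * 1024


def split_extent(offset: int, length: int, object_size: int = OBJECT_SIZE):
    """Split a byte range of a block device into per-object pieces.

    Returns a list of (object_index, offset_in_object, piece_length) tuples.
    """
    if offset < 0 or length < 0 or object_size <= 0:
        raise ValueError("offset and length must be >= 0, object_size > 0")
    pieces = []
    end = offset + length
    while offset < end:
        # Which object holds this byte, and where inside that object.
        index, within = divmod(offset, object_size)
        # Stop at the object boundary or at the end of the request.
        chunk = min(object_size - within, end - offset)
        pieces.append((index, within, chunk))
        offset += chunk
    return pieces


if __name__ == "__main__":
    # A 4 KiB write that starts 1 KiB before the end of the first object.
    for piece in split_extent(OBJECT_SIZE - 1024, 4096):
        print(piece)
```

In this sketch, a request that crosses an object boundary is split into two pieces that land in different objects, which a distributed object store can then place on different storage daemons.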

Ceph exposes its distributed object store (RADOS), which can be accessed through multiple interfaces:

  • RADOS Gateway: a Swift- and Amazon S3-compatible RESTful interface. See RADOS_Gateway for further information.

  • librados and the related C/C++ bindings.

  • rbd and QEMU-RBD: Linux kernel and QEMU block devices that stripe data across multiple objects.

For detailed installation instructions and benchmarking information, see http://www.sebastien-han.fr/blog/2012/06/10/introducing-ceph-to-openstack/.

