Preparing Bricks for GlusterFS

schaffung · Published in Dev Genius · Jun 18, 2021 · 8 min read

In this story, I'll go over how to create an XFS brick to be used by your Gluster volume.

Photo by Halacious on Unsplash: well, not bricks of the physical nature… more of a logical nature!

Now, I have a 10 GB disk added to my VM, which I'll be using for this purpose.

[root@dhcp41-206 ~]# lsblk
NAME                        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                           8:0    0   20G  0 disk
├─sda1                        8:1    0    1G  0 part /boot
└─sda2                        8:2    0   19G  0 part
  ├─fedora_dhcp41--206-root 253:0    0   17G  0 lvm  /
  └─fedora_dhcp41--206-swap 253:1    0    2G  0 lvm  [SWAP]
sdb                           8:16   0   10G  0 disk
sr0                          11:0    1 1024M  0 rom

So, for me that is sdb. You can use the lsblk command to check whether you have a disk other than the primary one. (Don't try these steps on the primary disk! Add a second disk and then proceed with the following steps.)

What are we trying to do ?

Before going ahead and following the steps, why not understand what we are trying to achieve here? Or, if you just want to follow the steps, skip ahead to the next section.

So, we are going to create a thinly provisioned logical volume (LV). That requires creating a physical volume (PV), followed by a volume group (VG), then the LV itself, and finally formatting it with an XFS filesystem. This means a lot is going on here, and it will be more interesting and meaningful if we know what we are actually doing.
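
In shell terms, the whole pipeline boils down to the following preview (these are the same commands we'll run step by step below; the only difference is that in the article itself the mount is done via /etc/fstab rather than a direct mount call):

# disk -> PV -> VG -> thin pool -> thin LV -> XFS filesystem -> mount
pvcreate /dev/sdb
vgcreate vg_gluster /dev/sdb
lvcreate -L 9G --thinpool gluster_thin_pool vg_gluster
lvcreate -V 9G -T vg_gluster/gluster_thin_pool -n lv_gluster
mkfs.xfs /dev/vg_gluster/lv_gluster
mkdir -p /glusterfs/brick
mount /dev/vg_gluster/lv_gluster /glusterfs/brick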

Let's begin with a disk. Now, when you hear disk, imagine those spinning magnetic platters that contain data, or rather the ability to store data.

Photo by Denny Müller on Unsplash

The above is generally termed an HDD (Hard Disk Drive). There is sometimes confusion when people say Hard Disk versus Hard Drive. Technically, for any device to be run and communicated with, especially a device like a disk that just contains magnetically aligned regions, you need some intermediary to translate the data, aka the drive (for those who know basic electrical engineering, think of a motor drive).

And so we have a Hard Disk and its driver. The complete unit is called a Hard Disk Drive, and we've since blurred those terms (as usually happens with all terms).

Back to the discussion: these are not the only storage media out there. We also have solid-state drives (SSDs) and flash memory.

But for our understanding, let’s just continue with the disk storage.

A layer above, we have the partitions. Partitions are created on the physical disk so that multiple OSes can be used (for example, *nix for work and Windows for gaming), and each of these partitions can have its own filesystem.

A layer above, we have the physical volumes (PVs), which are created over partitions (or over whole disks). Now, a physical volume, as the name states, is restricted to a single physical device; but what if someone wants their filesystem to span multiple storage devices?

That is where logical volumes come in. A volume group can be created which contains multiple physical volumes, and a layer above that, we finally reach the logical volume.

Now, these volumes can be thinly or thickly provisioned. A thinly provisioned LV isn't allocated all the space it is said to contain: on paper it has the stated size, but in actuality, space is allocated only as the demand grows. Thick volumes are the opposite: if I create X GB of thick volume, it comes allocated with X GB, and the user can expect to find exactly that.
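
As a quick sketch of the difference in LVM terms (some_vg and some_pool here are placeholder names, not ones we create in this article):

# Thick: all 5G is carved out of the volume group immediately
lvcreate -L 5G -n thick_lv some_vg
# Thin: 5G is only a virtual size; blocks come out of the pool on demand
lvcreate -V 5G -T some_vg/some_pool -n thin_lv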

Also, in the case of thin volumes, one can over-allocate. It's like the airline ticketing system that allows booking more tickets than the actual capacity of a flight, because they know somebody will eventually cancel (yes, they do have algorithms to determine the optimum number of over-bookings to allow).
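
For instance, nothing stops you from promising more than the pool physically holds (a sketch, continuing with the placeholder names above; LVM will typically print a warning about the overcommit but proceed):

# Two 8G thin LVs out of a 9G pool: 16G promised, 9G physically backing it
lvcreate -V 8G -T some_vg/some_pool -n thin_a
lvcreate -V 8G -T some_vg/some_pool -n thin_b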

The picture below should help you understand what I mentioned above about PVs, VGs, and LVs.

Memory Layout

Now that we have a picture of what we are trying to do, let’s begin with the steps.

The actual steps

Now, let's dive into the steps directly.

What we’re going to do is,

  1. Create a physical volume.
  2. Create a volume group using the physical volume.
  3. Create a thin pool, and then an LV out of this thin pool.
  4. Create an XFS filesystem on the LV and then mount it.

Using the lsblk command, I find that my disk is /dev/sdb (yours might be named something else; please check and use that particular disk and not any other disk!).

[root@dhcp41-206 ~]# pvcreate /dev/sdb
  Physical volume "/dev/sdb" successfully created.

On running the command pvs,

[root@dhcp41-206 ~]# pvs
  PV         VG                Fmt  Attr PSize   PFree
  /dev/sda2  fedora_dhcp41-206 lvm2 a--  <19.00g      0
  /dev/sdb                     lvm2 ---   10.00g 10.00g

We created a physical volume. The next step is to create a volume group using this physical volume.

[root@dhcp41-206 ~]# vgcreate vg_gluster /dev/sdb
Volume group "vg_gluster" successfully created

pvs now shows:

[root@dhcp41-206 ~]# pvs
  PV         VG                Fmt  Attr PSize   PFree
  /dev/sda2  fedora_dhcp41-206 lvm2 a--  <19.00g       0
  /dev/sdb   vg_gluster        lvm2 a--  <10.00g <10.00g

With a volume group ready, we can create a thin pool inside it.

[root@dhcp41-206 ~]# lvcreate -L 9G --thinpool gluster_thin_pool vg_gluster
Thin pool volume with chunk size 64.00 KiB can address at most 15.81 TiB of data.
Logical volume "gluster_thin_pool" created.

Using lvdisplay, we can see the following:

  --- Logical volume ---
  LV Name                gluster_thin_pool
  VG Name                vg_gluster
  LV UUID                n9Cr5D-acVx-IoNY-xsnj-zsUK-3qPz-0RqhI5
  LV Write Access        read/write
  LV Creation host, time dhcp41-206.lab.eng.blr.redhat.com, 2021-06-18 20:52:21 +0530
  LV Pool metadata       gluster_thin_pool_tmeta
  LV Pool data           gluster_thin_pool_tdata
  LV Status              available
  # open                 0
  LV Size                9.00 GiB
  Allocated pool data    0.00%
  Allocated metadata     10.58%
  Current LE             2304
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:4

Now this is the thin pool. We could also create an LV for data and an LV for metadata from the volume group, and then use lvconvert to combine them into a thin pool; here, just for demonstration purposes, I used lvcreate directly to create the thin pool. The alternative route looks roughly like the sketch below.
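
This is only for reference, not something run in this article (the LV names and the metadata size are illustrative, and it assumes a fresh volume group):

# Carve out a data LV and a small metadata LV from the same VG
lvcreate -L 9G  -n pool_data vg_gluster
lvcreate -L 16M -n pool_meta vg_gluster
# Combine them: pool_data becomes the thin pool, pool_meta its metadata
lvconvert --type thin-pool --poolmetadata vg_gluster/pool_meta vg_gluster/pool_data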

Now, we can go ahead and create a thin LV from this thin pool:

[root@dhcp41-206 ~]# lvcreate -V 9G -T vg_gluster/gluster_thin_pool -n lv_gluster
Logical volume "lv_gluster" created.

And with lvdisplay, we can see the path of this logical volume

[root@dhcp41-206 ~]# lvdisplay
  --- Logical volume ---
  LV Name                gluster_thin_pool
  VG Name                vg_gluster
  LV UUID                n9Cr5D-acVx-IoNY-xsnj-zsUK-3qPz-0RqhI5
  LV Write Access        read/write (activated read only)
  LV Creation host, time dhcp41-206.lab.eng.blr.redhat.com, 2021-06-18 20:52:21 +0530
  LV Pool metadata       gluster_thin_pool_tmeta
  LV Pool data           gluster_thin_pool_tdata
  LV Status              available
  # open                 2
  LV Size                9.00 GiB
  Allocated pool data    0.00%
  Allocated metadata     10.61%
  Current LE             2304
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:4

  --- Logical volume ---
  LV Path                /dev/vg_gluster/lv_gluster
  LV Name                lv_gluster
  VG Name                vg_gluster
  LV UUID                aXWfWo-uRYO-PVXo-dg15-juyF-EQ7n-kv5NWL
  LV Write Access        read/write
  LV Creation host, time dhcp41-206.lab.eng.blr.redhat.com, 2021-06-18 21:05:26 +0530
  LV Pool name           gluster_thin_pool
  LV Status              available
  # open                 0
  LV Size                9.00 GiB
  Mapped size            0.00%
  Current LE             2304
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:6

The next step is to create an XFS filesystem on this logical volume. For the mkfs.xfs command to work, we first need to install the xfsprogs package (here I'm using Fedora; the package name might vary with a different distro).
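
On Fedora, that's just:

dnf install xfsprogs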

Once the package has been installed, we can create the XFS filesystem.

[root@dhcp41-206 ~]# mkfs.xfs /dev/vg_gluster/lv_gluster
meta-data=/dev/vg_gluster/lv_gluster isize=512    agcount=16, agsize=147456 blks
         =                           sectsz=512   attr=2, projid32bit=1
         =                           crc=1        finobt=1, sparse=1, rmapbt=0
         =                           reflink=1
data     =                           bsize=4096   blocks=2359296, imaxpct=25
         =                           sunit=16     swidth=16 blks
naming   =version 2                  bsize=4096   ascii-ci=0, ftype=1
log      =internal log               bsize=4096   blocks=2560, version=2
         =                           sectsz=512   sunit=16 blks, lazy-count=1
realtime =none                       extsz=4096   blocks=0, rtextents=0
Discarding blocks...Done.

Again, a note here: I didn't supply the stripe geometry myself, as I'm delegating that decision to mkfs.xfs, but you can also provide the stripe unit and stripe width explicitly.
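
If you do want to set it manually, mkfs.xfs takes su (stripe unit) and sw (stripe width, as a multiplier of su) under the -d option; the values below are purely illustrative, e.g. for a two-disk stripe with a 64 KiB chunk size:

mkfs.xfs -d su=64k,sw=2 /dev/vg_gluster/lv_gluster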

Now, we have the basic backbone for the brick ready. We just need to mount this device by adding an entry to /etc/fstab.

The entry I have added is:

/dev/vg_gluster/lv_gluster /glusterfs/brick   xfs rw,inode64,noatime,nouuid 1 2

What does this mean?

  1. Field 1: The device to mount.
  2. Field 2: The mount point. I created a directory called /glusterfs/brick for mounting this device (see the command right after this list).
  3. Field 3: The filesystem type. In our case it is xfs.
  4. Field 4: Mount options. rw gives read-write access. inode64 allows inode numbers to be 64-bit; we could use inode32 instead, but on a big brick with a lot of inode entries being added, inode allocation can fail once the 32-bit-addressable region fills up, leading to a "no free space" issue even when space remains, hence inode64. The next option, noatime, removes the need for the system to write access-time updates for files that are simply being read; this can give some performance improvement, since we save writes to the device. The final option, nouuid, disables the duplicate-UUID check for an already-mounted filesystem, which is useful when mounting LVM snapshot volumes.
  5. Field 5: Filesystem dump. Used by the dump program; a non-zero value implies the filesystem will be backed up.
  6. Field 6: fsck order. A non-zero value means the filesystem will be checked by fsck at boot. The root partition should have this as 1; I've marked this as 2, implying it is second in line to be checked.
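
If you haven't created the mount point yet, do that first:

mkdir -p /glusterfs/brick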

After the entry has been added for our filesystem, we just need to run mount -a for the change to take effect. After that, we can use the df utility to verify:

[root@dhcp41-206 ~]# df -Th
Filesystem                          Type     Size  Used Avail Use% Mounted on
devtmpfs                            devtmpfs 942M     0  942M   0% /dev
tmpfs                               tmpfs    955M     0  955M   0% /dev/shm
tmpfs                               tmpfs    955M  792K  954M   1% /run
/dev/mapper/fedora_dhcp41--206-root ext4      17G  1.5G   15G  10% /
tmpfs                               tmpfs    955M  4.0K  955M   1% /tmp
/dev/sda1                           ext4     976M  141M  769M  16% /boot
tmpfs                               tmpfs    191M     0  191M   0% /run/user/0
/dev/mapper/vg_gluster-lv_gluster   xfs      9.0G   98M  8.9G   2% /glusterfs/brick

Now this brick is ready to be used for creating GlusterFS volumes.
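
As a teaser of what comes next (a sketch, not something run in this article; myvol and the hostname are placeholders, and note that Gluster prefers the brick to be a subdirectory of the mount rather than the mount point itself):

mkdir /glusterfs/brick/brick1
gluster volume create myvol dhcp41-206.example.com:/glusterfs/brick/brick1
gluster volume start myvol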
