LVM, Demystified
I've been a sysadmin for a long time, and part of being a sysadmin is doing more than is humanly possible. Sometimes that means writing wicked cool scripts, sometimes it means working late, and sometimes it means learning to say no. Unfortunately, it also sometimes means cutting corners. I confess, I've been "that guy" more than once. A good example is SELinux. On more than a few (hundred!) occasions, I've simply disabled SELinux, because getting things to work right is often really frustrating and time consuming. The same is true with LVM (Logical Volume Manager). I didn't get it. I thought it added an unnecessary layer of complexity. I thought it meant another potential point of failure. I thought it was stupid.
I was wrong.
LVM is an incredibly flexible, ridiculously useful and not terribly complicated to use system. It makes life easier. It makes future storage upgrades and migrations simple. Quite simply, I love it. So in this article, I cover the concepts and usage of LVM. By the time I'm done, hopefully you'll love it as much as I do!
What LVM Is
The best analogy I can come up with for explaining LVM is a SAN. If you've ever used a SAN (Storage Area Network) in your server environment, you know it abstracts the idea of individual hard drives and allows you to carve out "chunks" of space to use as drives. Rather than worrying about how big your hard drives might be, a SAN lets you throw all your hard drives into a big chassis and then allocate space to individual clients without being concerned about how many or how few physical drives are being used. LVM is sort of like that, but for an individual system rather than an entire network.
Figure 1 shows my poor attempt at drawing the concept of an LVM system. At first glance, it might seem like using an LVM is silly. Why combine a bunch of drives together, only to carve them up into virtual drives, right? Thankfully, that simple concept gives incredible flexibility down the road. Need a great big partition, but have only a bunch of smaller disks? No problem. Have only a couple disks now, but want to add more later without reformatting? No problem. Need to take snapshots, like with virtual servers, but you're using actual bare metal? No problem. LVM makes dealing with storage far better than partitioning drives or using a simple RAID setup (which, incidentally, brings me to the next issue).
Figure 1. It's important to think of my drawings as art—possibly second-grade art.
What LVM Isn't
With all the flexibility and expandability I mentioned in the previous paragraph, it seems like LVM would be a perfect replacement for hardware- or software-based RAID. After all, one of the big advantages of RAID is that multiple smaller drives can be used as a single, larger drive. For that particular feature, LVM is indeed ideal. Unfortunately, however, LVM doesn't provide any options for redundancy or parity. That means if you have a drive fail in LVM, you lose data. There's no such thing as striped LVM or mirrored LVM; it's simply not designed to do that.
LVM also isn't designed to increase speed by striping reads and writes across multiple disks. As block devices in the volume group fill up, such simultaneous read/writes may occur, but it's not by design and certainly not to gain speed. Hopefully, it's clear: LVM is really cool, but it is not in any way a replacement for RAID. Thankfully, it doesn't need to be.
(Note: recent versions of LVM actually do provide striping and mirroring features. In some cases, it can take the place of RAID completely. I still think understanding them as separate concepts is important. If you want to learn more about LVM and utilize the RAID features, I'll leave that as an exercise for you.)
Two Peas in a Pod
If you look at the first "stage" in my drawing (Figure 1), you'll notice that I didn't call the 10GB chunks "drives"; I called them physical volumes. That's because although it's certainly possible to use a physical drive as a physical volume in LVM, it's not a requirement. In fact, it's not even the most common scenario. In most production environments, LVM is used in combination with RAID. Whether that's hardware-based RAID or software-based RAID, having your underlying physical volumes exist as RAID devices is ideal.
As someone who has had problems with hardware-based RAID arrays, I tend to lean toward software-based RAID in my systems. That's certainly a matter of personal preference, but it's good to know that since software-based RAID and LVM both operate at the kernel level, both are extremely efficient. Software-based RAID admittedly uses some CPU, especially when rebuilding arrays, but LVM uses very little. If I/O performance is of utmost importance for your purposes, it's worth doing some research and possibly testing before committing to any solution.
Getting Started
Although it's certainly possible to transition to an LVM system after
Linux is already installed, it's far more preferable to do so during
the initial setup. Most distributions allow for LVM setup to take place
during the installation process, and in the case of CentOS and RHEL,
LVM is used by default. Even if you're installing only onto a single,
non-RAID hard drive, setting up LVM allows you flexibility and expansion
opportunity later. Heck, it's possible to add RAID to a server later on,
then simply migrate the data from your original physical volume to the
RAID physical volume. That's far easier than using dd
, especially when
you'd like to keep your server running!
Because this is an introduction, let me start with a simplistic setup. Let's say you have two hard drives, /dev/sdb and /dev/sdc. With LVM, any block device can be used as a physical volume (PV), which means you can use either partitions or entire drives. If you need to have a "traditional" partition (in some cases, the /boot partition might need to be on a regular, non-LVM device), be sure to partition the drive before adding the physical volumes to your volume group. In this example, let's use the raw disks themselves.
Step 1: Create Physical Volumes
Once you have the block devices you want to add to your volume group
(again, keep referring to my drawing if the terms get confusing), you
need to establish them as LVM physical volumes. To do that, use the
pvcreate
command:
pvcreate /dev/sdb
pvcreate /dev/sdc
These commands configure the drives as potential candidates to be added
to a volume group. If you want to make sure it worked correctly, you
can type pvdisplay
or pvscan
to show the status of any existing LVM
Physical Volumes:
$ sudo pvdisplay
--- Physical volume ---
PV Name /dev/sdb
VG Name
PV Size 10.4 GiB / not usable 3.00 MiB
Allocatable yes
PE Size 4.00 MiB
Total PE 4994
Free PE 4994
Allocated PE 0
PV UUID SRKAXh-EpYr-r2td-g0gA-31RA-fnfz-3qqGrO
--- Physical volume ---
PV Name /dev/sdc
VG Name
PV Size 10.4 GiB / not usable 3.00 MiB
Allocatable yes
PE Size 4.00 MiB
Total PE 4994
Free PE 4994
Allocated PE 0
PV UUID t2cKru-IwMy-I8re-ADp2-vzFF-Tvh5-O4zMhI
And, the simpler pvscan
:
$ sudo pvscan
PV /dev/sdb lvm2 [10.4 GiB]
PV /dev/sdc lvm2 [10.4 GiB]
Total: 2 [20.8 GiB] / in use: 0 [0 ] / in no VG: 2 [20.8 GiB]
Once you create the volume group and logical volumes, go ahead and run these commands again to see how the information changes. The differences should be obvious and should make sense.
Step 2: the Volume Group
You don't currently have any volume groups, so create one using the two physical volumes you just made:
vgcreate my_volume_group /dev/sdb /dev/sdc
Hopefully the command is clear. You've created a volume group named
my_volume_group using the physical volumes /dev/sdb and /dev/sdc. As
with the physical volumes, if you want to check the current state
of LVM Volume Groups on your system, type vgdisplay
to get a listing:
$ sudo vgdisplay
--- Volume group ---
VG Name my_volume_group
System ID
Format lvm2
Metadata Areas 2
Metadata Sequence No 1
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 0
Open LV 0
Max PV 0
Cur PV 2
Act PV 2
VG Size 20.8 GiB
PE Size 4.00 MiB
Total PE 9988
Alloc PE / Size 0 / 0 GiB
Free PE / Size 9988 / 20.8 GiB
VG UUID oVYiY6-bQp9-4CVO-QgrN-LGgB-1umR-ebJQo4
As you can see in the output, you've combined the available space of the two physical volumes (10.4GB each) into a total pool of 20.8GB. You could add more drives to the volume group or mix and match entire drives with partitions from other drives. LVM is very flexible. The large pool of available data does no good, however, until you create Logical Volumes to act as your usable disks.
Step 3: Logical Volumes
When you add a hard drive to your system, you don't really get to pick its name. You get /dev/sda, /dev/sdb and so on. When you create logical volumes, however, you decide what you want the devices to be called. You also get to decide how large each "drive" is as you carve it out of the larger volume group. It's good to note here that if you make your logical volumes too small, it's very easy to expand them later, so don't worry too much about planning for long-term potential needs. If you need more space later, you can just add it. To create your logical volumes, type:
$ sudo lvcreate -L 5G -n 5gig my_volume_group
Logical volume "5gig" created
Then to see what happened behind the scenes, type:
$ sudo lvdisplay
--- Logical volume ---
LV Path /dev/my_volume_group/5gig
LV Name 5gig
VG Name my_volume_group
LV UUID 3MxOB0-ce5o-yvBD-YORT-52qV-j8HJ-oDru2G
LV Write Access read/write
LV Status available
# open 0
LV Size 5.0 GiB
Current LE 5753
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 252:0
Notice how nice and clean the device-naming system is with LVM. It's
important to run the lvdisplay
command, however, to make sure you know
the mapped device name. Many systems use symbolic linking in an attempt
to make the device's virtual locations easier to find, but I think that
adds a layer of confusion for folks trying to understand what's really
going on.
Look, a New (Virtual) Hard Drive!
Once you've successfully created your logical volumes, it's just a matter of using them as block devices. If you need a filesystem to mount as your /home directory, just do this:
$ sudo mkfs.ext4 /dev/my_volume_group/5gig
$ sudo mount -t ext4 /dev/my_volume_group/5gig /home
And, your /home directory will be a whopping 5GB in size, but fully expandable, thanks to LVM. (Obviously, if you really want to mount your logical volume as your home directory, you should add an entry to /etc/fstab so it mounts on boot.) From the standpoint of your Linux system, however, /dev/my_volume/5gig is a block device similar to any hard drive you might plug in. You can use it as swap, format it like you did above, or even encrypt it and mount it somewhere as an encrypted partition.
That Was a Lot of Work, Why Again?
I know, in this little example, you've done nothing but create a JBOD (Just a Bunch Of Disks) type system, which will completely fail if you lose even one drive. The power of LVM isn't fully realized until down the road when you want to expand your logical volumes without migrating data. Or, when you want to take an LVM snapshot of your drive so you can roll back to an instantaneous backup when an upgrade fails. Or, when you replace a small drive with a fast RAID array and want to migrate the data quietly to your new PV.
The Logical Volume Manager is a system that abstracts storage devices. It does add a layer of complexity to your system, I won't lie, but the trade-off is significant. It may complicate your system a bit more, but it also simplifies your work a great deal when you have to deal with storage in the future.
You Keep Talking about the Future...
Hopefully at this point, you see LVM isn't a complete waste of time. When the time comes, what sorts of advantages will LVM provide? Here's a quick off-the-top-of-my-head list you might want to check out:
-
Move Logical Volumes from old, slow PVs to new, fast PVs, on the fly.
-
Resize Logical Volumes, filling more space in the Volume Group.
-
Stripe data across PVs in a VG for increased performance.
-
Resize Volume Groups by adding or subtracting physical volumes.
-
Take a snapshot of any Logical Volume, which can be restored later.
One of my favorite uses for LVM in production is to take an LVM snapshot before an upgrade. If something goes wrong, I can just revert back to the snapshot. Once you start thinking about all the possibilities LVM offers, you'll wonder why you waited so long!