
Managing LVM with PVMOVE – Part 2

After a short introduction to LVM in the article Managing LVM with PVMOVE – Part 1, it’s time to get into the details of the pvmove command. Based on the scenario and constraints described in Part 1, I will elaborate on the pvmove operation here.


Before any operation, carry out your pre-check tasks, such as verifying the state of the server, the URLs, the services and applications running, the ports they are listening on, and so on. This makes it possible to handle complications at both the system and the application level. For the pvmove operation specifically, I recommend running the vgs, pvs and lvs commands as well as fdisk -l to check the state of the physical disks. Also run df -h and, if possible, lsblk to list all block devices for future reference. On CentOS / RedHat, lsblk is provided by the util-linux package, which you may need to install; it is a very handy tool.
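A minimal sketch of how these pre-checks could be captured into files, so that they can be compared with the state after the move. The function name precheck_snapshot and the file layout are my own assumptions, not part of any tool; commands that are not installed are noted in the output file instead of aborting the run.

```shell
#!/bin/sh
# Save the output of each pre-check command into its own file.
# precheck_snapshot and the file naming are illustrative choices.
precheck_snapshot() {
    outdir="$1"
    mkdir -p "$outdir" || return 1
    for cmd in "vgs" "pvs" "lvs" "fdisk -l" "df -h" "lsblk"; do
        tool=${cmd%% *}                          # first word = binary name
        file="$outdir/$(echo "$cmd" | tr ' ' '_').txt"
        if command -v "$tool" >/dev/null 2>&1; then
            # Keep stderr too: the LVM tools complain when not run as root.
            $cmd >"$file" 2>&1 || echo "# '$cmd' exited with an error" >>"$file"
        else
            echo "# '$tool' is not installed" >"$file"
        fi
    done
}

# Example (as root): precheck_snapshot "/root/precheck-before-pvmove"
```

Running it once before and once after the operation into two different directories gives you two sets of files that can be compared with diff.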



Screenshot from 2015-09-30 21:07:28

Let us say we have a disk of 100G [let’s call it sdb] and we have to replace it with another disk of 20G [sdc]. We assume that 10G of data is in use on the 100G disk, which can reasonably be handled by the 20G disk we plan to move to. On sdb we have 2 LVs: lvsql, which is used by MySQL and holds the database, and lvhome, which handles the rest of the server applications. See the diagram on the right. Both LVs are in the same VG called vgname.

So you might ask: if a df -h on the machine shows that lvsql has a total size of 15G and lvhome 80G, making 95G in total, while the hard disk is 100G [as described in the diagram above], is there 5G missing? The answer is no. When you run pvdisplay, look at the “PE Size”, “Total PE”, “Allocated PE” and “Free PE” lines. The PE Size is the size of one extent (4MB in the screenshot on the left), and the 5G that df does not show corresponds to extents that are not allocated to any LV.

Usually that 5G is kept aside on purpose, for example in our situation for a backup of the MySQL database or for other processes. If it does not appear in any LV, it is still in the VG and has not been allocated to an LV. It’s important to keep some additional space there. So before you start, run pvdisplay, lvdisplay and vgdisplay as well. We now have our sde hard disk as described in the picture below.
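To see how much space is still unallocated in the VG, the free extent count can be multiplied by the extent size. A small sketch, assuming a PE Size of 4 MiB as in the screenshot and a hypothetical Free PE count of 1280; the helper name free_pe_to_gib is mine, and the real numbers come from the “PE Size” and “Free PE” lines of pvdisplay.

```shell
# Convert a number of free physical extents into whole GiB.
# $1 = Free PE count, $2 = PE Size in MiB (both read from pvdisplay).
# Integer arithmetic: any fraction of a GiB is truncated.
free_pe_to_gib() {
    echo $(( $1 * $2 / 1024 ))
}

free_pe_to_gib 1280 4   # 1280 extents of 4 MiB each -> prints 5
```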



Screenshot from 2015-09-30 21:05:13

How to start? It’s up to you to decide which LV you want to allocate more space to, as you have control over the VG. You can also add another physical disk, extend vgname and resize lvsql, since a database usually grows in size.

Do your pre-check tasks as described.

    1. Stop all running applications, such as MySQL and Atop in our case. You can use the lsof command to check whether processes or applications are still writing to the disk.
    2. Once all applications have been stopped, run the mount command to check which partitions are mounted, and confirm with lsof or fuser that no application is still writing to the disk.
    3. Comment out the two lines related to the 2 LVs of the VG in the /etc/fstab file. Usually these would be the lines for /dev/vgname/lvsql and /dev/vgname/lvhome. You can now umount the partitions. Note that umount will fail if an application is still writing to them. Another issue is that if you have ssh running on the machine, you need to stop it! Then how do you get on the machine? Use the console from vSphere or, if it’s a physical machine, boot it from a live CD.
    4. The next step is to lvreduce the LV where possible, according to the space used. In our case 5GB is used out of 80, so run lvreduce -L 7G --resizefs /dev/vgname/lvhome. This makes the pvmove faster: the bigger the LV, the longer the move takes.
    5. Once all LVs have been reduced as much as possible, add the disk sdc. Make sure it is detected if you are on VMware. Use the fdisk command and compare the output with your pre-check.
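The fstab edit described above can be scripted so that it is easy to undo once the migration is finished. A minimal sketch, assuming the entries start with /dev/vgname/ as in this example (entries written as /dev/mapper/... or by UUID would need a different pattern); the function names are illustrative, and a backup copy with the suffix .bak is kept next to the file.

```shell
# Comment out every active line that mounts an LV of the given VG.
# $1 = fstab path, $2 = VG name. A backup is written to "$1.bak".
comment_vg_mounts() {
    cp "$1" "$1.bak" || return 1
    sed "s|^/dev/$2/|#&|" "$1.bak" > "$1"
}

# Reverse operation, for when the migration is finished.
uncomment_vg_mounts() {
    tmp=$(mktemp) || return 1
    sed "s|^#\(/dev/$2/\)|\1|" "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Example (as root): comment_vg_mounts /etc/fstab vgname
```

Trying it first on a copy of /etc/fstab is a cheap way to confirm that only the intended lines change.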


  1. Now create a PV from sdc. The command is pvcreate /dev/sdc.
  2. After the PV has been created, extend the VG called vgname, which so far only contains the old disk (sdb), with the disk sdc you just added. The command is vgextend vgname /dev/sdc
  3. Now the magic starts: run pvmove /dev/sdb /dev/sdc. This moves the allocated PEs of vgname from the hard disk sdb onto sdc.
  4. When the pvmove is completed, you just need to run vgreduce vgname /dev/sdb. The vgreduce removes the old disk from the VG, as it no longer holds any allocated PEs. You can now remove the old disk.
  5. Since you reduced lvhome in step 4 of the preparation to speed up the pvmove, you will notice that lvhome is at 7GB instead of 10G. The next step is lvresize -l +100%FREE /dev/vgname/lvhome. You will notice that the PE Size is unchanged, and lvsql is untouched as we did not resize it.
  6. You can now run /sbin/resize2fs /dev/vgname/lvhome. Uncomment the lines in fstab, run mount -av and restart your applications.
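The sequence above can be rehearsed as a dry run before touching any disk. The sketch below only prints the commands unless DRY_RUN=0 is set explicitly; the run and migrate_pv helpers are hypothetical names of mine, and the VG, disk and LV names are the ones used in this example.

```shell
#!/bin/sh
# Dry-run wrapper: print each command; execute it only when DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# $1 = VG, $2 = old PV, $3 = new PV, $4 = LV to grow afterwards.
# The chain stops at the first command that fails.
migrate_pv() {
    run pvcreate "$3" &&
    run vgextend "$1" "$3" &&
    run pvmove "$2" "$3" &&
    run vgreduce "$1" "$2" &&
    run lvresize -l +100%FREE "/dev/$1/$4" &&
    run resize2fs "/dev/$1/$4"
}

migrate_pv vgname /dev/sdb /dev/sdc lvhome
```

Reading the printed commands against your pre-check output is a simple way to catch a swapped source and destination disk before it matters.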



Congrats, you just did a pvmove. Leave a comment below so that we can clear up any ambiguities.

Managing LVM with PVMOVE – Part 1

One of the challenging issues that I have encountered is the manipulation of LVM (Logical Volume Management) on virtual servers. Whilst writing this article, I noticed that I had to split it into parts as it was getting bulky. Once you have understood the logic of LVM, things get pretty easy to deal with. I will elaborate on some details of LVM and briefly go through some real-life experience that is important to grasp. Let’s take the example of a disk used by running applications such as MySQL and Apache and by monitoring tools such as Atop and Htop, all writing to the disk, and we have to shrink that very disk or replace it with another one. Let’s also assume that the server is running on an ESX host and that the operation will be carried out through VMware vCenter. How do you shrink a disk whose applications are generating IOs? How do you replace a disk with a smaller one when its data is on an LVM?

In brief, here is what I have understood about LVM – Logical Volume Management.

We have Physical Volumes (PV), Volume Groups (VG) and Logical Volumes (LV). These terms must be understood before proceeding with Logical Volume Management operations.

PV – These are hard disks or even hard disk partitions. These PVs have logical unit numbers. Each PV is composed of chunks called PEs (Physical Extents).

VG – The Volume Group is the highest level of abstraction used within LVM. Does this term look complicated? I would say no. In the field of Software Engineering, two terms are commonly used: modeling and meta-modeling. If you are not familiar with software engineering, just understand it like this: modeling means one step or one level removed from reality, whilst meta-modeling means modeling at a higher level of logic. So basically, it looks like some sort of higher modeling happens at the level of the VG.

LV – The logical volume is a product of the VG. LVs are disks which are not restricted to the physical disk size, and they can be resized or even moved easily without stopping your applications or unmounting your file systems. Of course, I need to do more research to give a deeper explanation of the PV, PE, VG and LV. Let’s now see an explanation through my horrible drawing skills.


Screenshot from 2015-09-29 21:04:46

From the diagram we conclude that:

  • PVs look like hard disks divided into chunks called PEs.
  • The VGs are just a higher level of abstraction, to be looked at from above.
  • VGs are created by combining PVs.

If you have access to a Linux machine with LVM configured and some VGs already created, you can start running these commands to get an overview of the PVs, VGs and LVs:

  • pvs – Report information about physical volumes
  • vgs – Information about volume groups
  • lvs – Information about logical volumes

Those physical disks can be picked up from the datastores, where RAID is configured. This acts as another layer of security to handle disk failures and avoid loss of data.

Screenshot from 2015-09-29 21:37:28


On Linux, if you type vg, lv or pv and press tab twice, you will get an idea of all the commands that exist and the possible manipulations. In part 2 of this article, I will take the example of the pvmove command and the actions that can be taken to minimize impact before carrying out a pvmove operation.

Part2 of the article is on this link

Managing and Analysing disk usage on Linux

Disk usage management and analysis on servers can be extremely annoying, even disastrous, if they are not carried out regularly. However, you can also brew some scripts to sort out the mess. In large companies, monitoring systems are usually set up to handle the situation, especially where there are several servers with sizeable partitions allocated for, let us say, the /var/log directory. An inventory tool such as OCS Inventory can also be used to monitor a large number of servers.


This blog post will be updated as my own notebook to remember the commands used when managing disk usage. Feel free to send me your tips and tricks, and I will update the article here 🙂

Managing disk space with ‘find’ command

1. Find the total size of everything in the current directory that is older than 1000 days.

find . -mtime +1000 -exec du -csh {} + | grep total$   

2. Find files of more than 50 MB on the / partition and list them in long, human-readable format.

find / -xdev -type f -size +50M -exec ls -lh '{}' ';' 

3. Find in /tmp every file or directory whose name starts with develop and whose mtime is more than 1 day old, and delete it.

find /tmp -name "develop*" -mtime +1 -exec rm -rf {} \; 

4. Save the listing of /home to the file /tmp/uniqDirectory, then for every entry print its name and count the items in the directory of the same name under /home/test.

find /home > /tmp/uniqDirectory && for i in $(cat /tmp/uniqDirectory);do echo $i; ls -l /home/test/$i |wc -l;done

5. Find from the current directory (here /tmp) all files with the extension .sh or .jar and calculate their total size.

find . -type f \( -iname "*.sh" -or -iname "*.jar" \) -exec du -csh {} + | grep total$

6. Find all files in /tmp, check which ones are not being used by any process and delete those.

find /tmp -type f | while IFS= read -r file ; do fuser -s "$file" || rm -f "$file" ; done

7. I once encountered a VM which, during an incident, had switched to read-only mode after an intervention on the SAN. After several hours, the disk was back in read-write mode. At that time, several processes were using the disk, such as Screen, ATP, NFS etc. I noticed that disk usage had climbed to 90% on the /var partition although the du command did not show that amount consumed. To troubleshoot the issue, the following command came in handy: it shows the processes that are still holding deleted files open. After restarting the service, usage was back to 2%.

lsof | grep "/var" | grep deleted
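The effect behind this incident can be reproduced safely: a file that is deleted while a process still holds it open keeps its blocks until the descriptor is closed, which is why du and df disagree. A small sketch, assuming a Linux /proc filesystem; the choice of file descriptor 9 is arbitrary.

```shell
#!/bin/sh
# Create a scratch file, keep it open on fd 9, then delete it.
scratch=$(mktemp)
exec 9>"$scratch"
rm -f "$scratch"

# The name is gone, but the open descriptor still points at the file.
# The kernel marks the link target as "(deleted)", the same marker
# that the lsof command above is grepping for.
held=$(readlink /proc/self/fd/9)
echo "$held"

# Closing the descriptor (or restarting the service, as in the
# incident above) is what actually releases the space.
exec 9>&-
```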

Another interesting issue that you might encounter is a sudden increase in log size caused by an application failure, for example a sudden increase in the binary logs generated by MySQL, or a core dump being generated!

Let’s say we have a crash package installed on a server. The crash package will generate a core dump to analyse why the application crashed. This is sometimes annoying, as you cannot predict when an application is going to fail, especially if many developers and system administrators work on the same platform. I would suggest a script which sends a mail to a particular user whenever a core dump is generated. I placed a script here on GitHub to handle such a situation.

Naturally, log rotation is of great help, as are cron jobs to purge certain temporary logs. The “du” command is helpful, but when you need to pick out files by specific criteria, you will need the find command.

Tips:

    • You should be extremely careful when deleting files with the find command. Imagine deleting a replication log file that is being used by an Oracle Database Server. This would be disastrous.
    • Running lsof on a mount point can also be useful for troubleshooting disk usage.
    • Also, make sure that you check the content of a log before deleting it, as any file can be named *log or *_log.
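In the spirit of the first tip, one way to stay careful is to preview what find matches before deleting anything. A minimal sketch, with a hypothetical helper named purge_old: it only lists the matching files unless the word delete is passed as the fourth argument.

```shell
# List (default) or delete regular files matching a name pattern that
# are older than N days.
# $1 = directory, $2 = name pattern, $3 = days,
# $4 = "delete" to really remove; anything else only previews.
purge_old() {
    if [ "${4:-}" = "delete" ]; then
        find "$1" -xdev -type f -name "$2" -mtime +"$3" -exec rm -f {} +
    else
        find "$1" -xdev -type f -name "$2" -mtime +"$3" -print
    fi
}

# Example: purge_old /tmp "develop*" 1          # preview first
#          purge_old /tmp "develop*" 1 delete   # then delete
```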